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CNj , We introduce a version of the cavity method for diluted mean-field spin models that allows the 

» I ■ computation of thermodynamic quantities similar to the Franz- Parisi quenched potential in sparse 

^ 1' random graph models. This method is developed in the particular case of partially decimated 

, random constraint satisfaction problems. This allows to develop a theoretical understanding of 

a class of algorithms for solving constraint satisfaction problems, in which elementary degrees of 
04 ' freedom are sequentially assigned according to the results of a message passing procedure (belief- 

propagation). We confront this theoretical analysis to the results of extensive numerical simulations. 

pH , I. INTRODUCTION 

' , In constraint satisfaction problems (CSP) a set of variables is required to simultaneously satisfy a series of con- 
. ' straints. One can equivalently define an energy function as the number of unsatisfied constraints of a given assignment 
"j^ . of the variables, and rephrase the CSP as the quest for a zero energy groundstate configuration. This analogy with 
low temperature physics triggered an intensive research effort within the statistical mechanics community. More 
i' \ precisely, one line of approach for these problems consists in looking for typical properties of randomly generated 
'■^ ■ large instances. This translates into the presence of quenched disorder in the corresponding physical model. The 
^ ■ constraints to be satisfied are generally contradicting each other, and the definition of the random instances does not 
O , involve an underlying finite dimensional space; in consequence these problems fall into the category of mean-field spin 
1^1 ' glasses, for which a set of analytical tools have been developed during the last decades [1-3]. 

The most famous example of random CSP is the random fc-sat ensemble. Statistical mechanics studies have 
' led to two kind of results for this problem. On the one hand, qualitative and quantitative predictions have been 
[ made about the various phase transitions encountered for the typical behaviour of large instances when the control 
' parameter governing the amount of frustration is varied. The satisfiability transition marks the sudden disappearance 
' of the solutions (zero-energy groundstates). There exist rigorous results on the properties of this transition [4] and 
bounds [5, 6] on its possible location. Statistical mechanics has complemented these results with an heuristic way to 
compute this threshold hence yielding quantitative conjectures on its value [7-9]. Another important contribution has 
been the suggestion of other phase transitions in the satisfiable regime, that concerns the geometrical organization 
' of the set of solutions inside the configuration space [7, 8, 10, 11]. For large enough frustration, but below the 
, satisfiability transition, the solutions can be grouped in clusters of nearby solutions, each cluster being separated from 
the others. 

On the other hand attention has also been paid to algorithmic issues, that is to procedures aiming at solving CSP, 
by finding their solutions or proving that no solution exist. These algorithms can be roughly divided in two broad 
5—1 ■ categories: local search and sequential assignment procedures. In the first one, which has also been studied with 
statistical mechanics methods [12-14], a random walk is performed in the configuration space, with transition rules 
tuned to bias the walk towards the solutions. This kind of algorithm is called incomplete: it cannot prove the absence 
of solution if it fails to find one. The second category proceeds differently: at some step of the algorithm only one 
part of the variables has a definite value, the others being still free. Each step thus corresponds to the choice of one 
free variable and of the value it will be assigned to, the CSP on the remaining variables being consequently simplified. 
The heuristics guiding these choices can be more or less elaborate. In the simplest cases one only takes into account 
simple properties of the free variables, such as the number of their occurences in the remaining CSP. A rigorous 
analysis of these simple ("myopic") approaches is possible and is at the basis of most of the lower bounds on the 
satisfiability transition [5]. Such algorithms can be made complete if backtracking is allowed, i.e. choices which have 
led to a contradiction can be corrected in a systematic way. This complete version of the procedure is called the 
DPLL algorithm [15]. 

One outcome of the statistical mechanics studies of random CSP has been the proposal of an incomplete sequential 
assignment algorithm called Survey Propagation inspired decimation [7, 16], which proved to be very efficient on 
satisfiable random instances close to the satisfiability transition. This algorithm relies on the clustering picture of 
the solution space in the satisfiable regime. Unfortunately its theoretical analysis is much more difficult than for 
myopic ones. Indeed the heuristics of choice of the variables to be assigned is based on the result of a message passing 
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iterative computation which depends on the whole remaining CSP in an intricate way. More generally the analysis of 
message-passing decimation procedures is difficult and there are few results on this issue, with the notable exception 
of [17-19] for the so-called Warning Propagation algorithm on overconstrained satisfiability formulas. 

In this paper wc provide an analytical description of an algorithm similar to Survey Propagation, yet simpler. It 
has been studied numerically in [20, 21]. A part of our results were published in [22], the method developed there was 
also applied to another family of CSP in [23]. In the sequential assignment procedure under investigation the choice of 
the value of the assigned variable is made at each step according to the Belief Propagation message passing algorithm 
(instead of Survey Propagation). It aims at mimicking the following ideal procedure. After a certain number of 
variables has been assigned, one can define the uniform probability measure over the solutions of the CSP which arc 
still compatible with the previous choices. If one were able to compute the marginal probabilities of this (conditional) 
probability measure and use them to draw the value of the newly assigned variable at each step, one would construct 
an uniform sampler of the solutions of the original CSP, and this would in particular lead to an algorithm for finding 
one solution of the CSP. The computation of these marginal probabilities is a computationally intractable task; Belief 
Propagation is a fast heuristic algorithm, widely used for inference problems [24, 25] , which is often able to compute 
good approximations of these marginal probabilities. Analyzing the behaviour of the Belief inspired decimation 
procedure thus amounts to control the error which accumulates at each step by using the BP approximate estimates 
of the marginal probabilities instead of the exact ones. A theoretical understanding and quantitative description 
of the deviations between exact and BP-computcd marginal probabilities for graphical models is a formidable open 
problem that wc shall not attack directly in this paper. Wc will instead perform a theoretical analysis of the putative 
algorithm based on an hypothetic exact marginal computation. This analysis will be obtained by a generalization 
of the cavity method which is able to deal with the partially decimated CSP encountered along the execution of the 
algorithm, and to compute the extended phase diagram of these problems. This approach is technically similar to the 
computation of Franz-Parisi quenched potentials [26] . The relevance of this theoretical analysis for the understanding 
of the approximate BP implementation will then be argued on the basis of a comparison with extensive numerical 
simulations. 

The paper is organized as follows. In Sec. II wc give a more precise definition of the ideal decimation procedure 
sketched above and explain how an approximate realization of this idea can be performed in practice. Sec. Ill is 
devoted to the cavity method for decimated formulas that provides an analytical description of the ideal decimation 
procedure. In the next two sections we apply this formalism to two specific CSP, and compare its predictions to the 
results of numerical simulations of the BP guided decimation algorithm. We begin in Sec. IV with the xor-satisfiability 
problem, a well-studied simple example for which many results can be checked with alternative techniques. We then 
turn to the case of ^-satisfiability random instances in Sec. V. We draw our conclusions in Sec. VI. More technical 
details are deferred to an Appendix. 



A Constraint Satisfaction Problem (CSP) is defined on a set of N variables ci, . . . ,<Jn, taking values in a finite 
alphabet. We shall denote = (cri , . . . , ctn) the global configuration of the variables, and for a subset S of the indices 
{1, . . . , A^} we call ffg = {cr^, i e S"} the partial configuration of the variables in S. The solutions of the CSP are the 
configurations which simultaneously satisfy M constraints (also called clauses in the following), each of them being 
specified by a function ipa{£sa) ^ subset da of the variables. The function ipa takes value 1 (resp. 0) whenever the 
constraint is satisfied (resp. unsatisfied). A CSP admits a natural representation in terms of a factor graph [24], i.e. 
a bipartite graph where one type of vertex (variable node) is associated to each variable i = I, . . . , N, another type 
(function node) to each constraint a = 1, . . . , M. An edge links the i'th variable node with the a'th function node 
whenever the constraint ipa depends on cr^, i.e. in the notation introduced above whenever i G da. We shall similarly 
denote di the set of function nodes which depend on the i'th variable, and define the distance between two variable 
nodes i and j as the minimal number of constraint nodes encountered on a path of the factor graph joining i and j. 

In the following we will concentrate on two examples of CSP, both on binary variables that we shall represent by 
Ising spins, Ui = ±1: 

• fc-xorsat. Each constraint a depends on k distinct variables da = {i", . . . and requires the product of the 
corresponding spins to take a given value € {— 1,-|-1}: 



II. A THOUGHT EXPERIMENT 



A. Definition of random CSPs and a brief review of their properties 
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where here and in the following I(-) is the indicator function of an event. This condition is easily seen to be 
equivalent to a constraint on the value of the eXclusive OR of k boolean variables, hence the name of the 
problem. 

• fc-sat. The constraint a depends again on da = ... but imposes the configuration of these k variables 
to avoid one out of the 2'' possible ones, 



logical OR of k literals (a boolean variable or its negation) to evaluate to TRUE. 

Prom a computational complexity point of view these two problems are very different. The decision version of a 

CSP consists in determining whether it admits at least one solution, i.e. one configuration satisfying all constraints 
simultaneously. The fc-xorsat decision problem belongs to the easy, polynomial P complexity class [27] for any value 
of k. One can indeed use Gaussian elimination to check if the associated system of linear equations modulo 2 is 
solvable, fc-sat is on the contrary NP-complete for all fc > 3: no algorithm able to decide the satisfiability of every 
fc-sat formula in a time bounded by a polynomial of the formula size is known. 

Despite this deep difference in the worst-case point of view, these two families of problems share common features in 
their "average complexity" behavior. By this we mean the random ensembles of instances that have been extensively 
studied in the computer science and statistical physics literature and that are defined as follows. A random fc-xorsat 
formula is generated by drawing in an independent, identical way M constraints; the fc-uplet of indices da is drawn 
uniformly among the (^) possible ones, and the coupling constant J"^ is taken to be ±1 with equal probability one 
half. The generation of a random fc-sat formula is similar, with for all constraints a the k constants { Jf }ieaa being 
taken independently equal to ±1 with equal probability. These random ensembles exhibit a rich phenomenology in 
the thermodynamic limit N,M ^ oo with a — M/N fixed. In particular a satisfiability phase transition occurs at a 
value as (which depends on the value of k and on the problem, sat or xorsat, under consideration): random formulas 
with a < as are, with high probability, satisfiable, whereas for a > ag they are unsatisfiable. Here and in the following 
"with high probability" (w.h.p.) means with a probability going to one in the above stated thermodynamic limit. 
To be more precise;, for xorsat this statement has been proven and the values of a.s have been computed [28, 29]. 
For sat random formulas this satisfiability transition is, strictly speaking, only a conjecture. The existence of a tight 
threshold as{N) has been proven in [4], but not the convergence of as{N), that could in principle oscillate between 
the bounds on its possible location [5, 6]. It is however most probable that the values of as computed within the 
statistical mechanics framework [7-9] are exact. 

Besides the satisfiability transition other interesting phenomena occurs in the satisfiable phase a < as- In this 
regime the formula are w.h.p. satisfiable and in fact admits an exponential number of solutions; however there are 
structural phase transitions at which the properties of the set of solutions change qualitatively. To describe the set of 
solutions it is convenient to introduce the uniform probability measure on this set. 



where the normalizing factor Z is the number of solutions of the CSP. Note that this probability measure is itself a 
random object, as it depends on the instance of the CSP under study, and that it is defined only for the satisfiable 
instances, which is the case w.h.p. in the regime a < as we are considering here. 

In the xorsat ensemble of random formulas there is a single structural phase transition in the satisfiable regime [28, 
29], known as the clustering transition with a threshold denoted ad. For lower values of the connectivity a, the set 
of solutions is well-connected. On the contrary for aa < a < as the exponential number of solutions gets splitted 
in an exponential number exp[A^S] of clusters, separated from each other in the configuration space. The rate of 
growth S of the number of clusters is usually called the complexity, it decreases when a grows and vanishes at the 
satisfiability threshold. In the clustered phase each cluster contains the same exponential number of solutions (with 
a rate of growth called internal entropy). The structure of xorsat is sufficiently simple for a clear-cut definition of 
the clusters to be possible. In fact the clustering transition corresponds to a percolation transition in the associated 
factor graph, where an extensive 2-core discontinuously appears. 

The structure of the satisfiable phase of the random satisfiability ensemble is richer [11, 30]. At the clustering 
transition [7, 10] the exponentially numerous clusters have, contrarily to xorsat, a large diversity of internal entropies. 
This leads, for > 4, to another transition, the condensation one at ac. When ad < a < ac the measure (3) is splitted 
in an exponential number of clusters, while for ac < a < as almost all solutions are contained in a sub-exponential 
number of clusters. These various phase transitions can be characterized in terms of the strength of the correlations 
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between variables. The clustering is related to the appearance of long-range point-to-set correlations, or in other 
words to the possibility of reconstruction of the value of a variable given the values of all the variables at a large 
distance from it [31]. At the condensation transition non trivial correlations are already revealed by the correlation 
functions between a finite number of variables [11]. 



The study presented in this paper is based on the analysis of an ideal procedure to find the solutions of a CSP, that 

we discuss here more precisely. Consider a satisfiable CSP instance, and the uniform measure /x(-) over its solutions 
defined in (3). Let us also introduce a subset D of the variables, and a partial configuration of these variables Tjj 
compatible with at least one solution of the instance. We can thus define a conditional version of n, /u(-|t£)), which 
is the uniform measure over the solutions of the formula compatible with r^,: 



The normalization Z{tjj) counts the number of solutions compatible with the partial assignment of the variables in 



A possible procedure for sampling from /i(-) goes as follows. Choose arbitrarily a permutation of {1, . . . , N}, denoted 
. . . , i{N), and call Dt = {z(l), . . . , i{t)} for f = 1, . . . , A^, Dq = 0. Construct now sequentially a configuration 
r, assigning at time t the value Ti(^t)] to this aim draw fTj(t) according to the marginal of the conditioned measure 
h{-\tjj^_^), and set Tj^j) = cri(t). It is easy to see that after t steps of the algorithm the partial configuration Tjj^ 
is distributed according to the marginal law of /u(-). In particular the final configuration t obtained when the A'' 
variables are assigned is a uniformly chosen solution of the CSP. 

This simple algorithm would thus provide a uniform sampler of the solution set of any CSP; it is however only 
meant as a thought experiment. Indeed, computing exactly the probabilities M('''i(t) l2lDt_i) general a t^P- 

complete problem, with no polynomial algorithm known until now, and we shall thus content ourselves with faster 
yet approximative means for computing these marginal probabilities. Before introducing them let us discuss further 
the idealized procedure. 

The analytical description of the dynamics followed by this ideal process seems very difficult: at each time step 
the probability of the evolution T£,^_^ — > Tjj^ depends in a non-trivial way on all the choices made in the previous 
steps. However the description of the process at a given point of its evolution is very simple. As noted above t^,^ 
is distributed according to the marginal of /i(-). One can state this in a slightly different way: r^^ can be obtained 
by drawing uniformly a solution r from /x(-), retaining the configuration of the variables in Dt, and discarding the 
rest of the configuration. We shall further assume that the permutation i(l), . . . , i(A'') is drawn uniformly at random, 
such that Dt is a random set of t variables among A^. In the thermodynamic limit we shall define 6 = t/N, the 
fraction of assigned variables, and consider for simplicity that -Dew is built by retaining independently each variable 
with probability 9 (wc only make an error of order on the fraction of variables thus included \n D). 

These considerations lead us to the definition of an ensemble of CSP instances parametrized by a and 0, generalizing 
the original one which corresponds to ^ = 0. Explicitly this ensemble of formulas corresponds to the following 
generation process: 

1. draw a satisfiable CSP with parameter a 

2. draw a uniform solution r of this CSP 

3. choose a set D by retaining each variable independently with probability 6 

4. consider the residual formula on the variables outside D obtained by imposing the allowed configurations to 

coincide with r on D. 

Let us emphasize that, apart from simple cases like the xorsat model, these ensembles do not coincide in general with 
randomly uniform formulas conditioned on their degree distributions. The fact that the generation of the configuration 
r depends on the initial CSP induces non-trivial correlations in the structure of the final formula. 

We shall see in the following how to adapt the statistical mechanics techniques to compute the typical properties 
of suc;li generalized formulas, and in particular to determine the phase transition thresholds in the (a, 6) plane. One 
characterization of these random ensembles is the quenched average residual entropy, 



B. Oracle guided algorithm and ensemble of decimated CSPs 
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where the throe expectation values correspond to the three steps of the definition above. This quantity is similar, 
yet distinct, from the Franz- Parisi quenched potential [26]. The definition of the latter also involves a "thermalized" 
reference configuration r, but is given by the free-energy of the measure on the configurations at a given Hamming 
distance from t. In other words the two real replicas g_ and t are coupled uniformly across the variables in Franz-Parisi 
quenched potential, whereas in the definition of oj they are coupled infinitely strongly on D where they are forced to 
coincide, and not at all outside D. The computations presented in the rest of the paper can however be easily adapted 
to obtain the usual quenched potential. 

We shall characterize the reduced measure ij,(-\tjj) more precisely by computing other quantities besides u}{6). The 
existence of clusters in this measure will be tested by the computation of the long-range point-to-set correlations and 
the complexity of the typical clusters. 



C. Bethe-Peierls approximation for decimated CSPs 

We recall in this section the Bethe-Peierls approximation for statistical models defined on factor graphs and show 
how to adapt it to partially decimated CSPs. Let us first consider a probability measure with a weight funcion which 
can be factorized as in Eq. (3), with ipa some a priori arbitrary positive functions. The Bethe approximation for the 
computation of the partition function Z consists in extremizing the following expression, 



\nZ = - In ( ^:/a^i(o-i)%^a(cri) j -F^ln ^Va(o:a„) 7?i^a(<7i) + ( 5Z 11 '^''^^(^^ 



(6) 



over the unknown ^lya^i^) f]i^a 

}. These are probability measures on the alphabet of cTj, defined on the directed edges 
of the factor graph, which we shall call messages for reasons that will become clear below. The extremization of the 
Bethe approximation for In Z leads to a set of equations between the messages, 

t'a^i(o-i) = fiiVj^ajjedaXd ; Vi^aic^i) = g{{l'b^i}bedi\a) , (7) 

where the (edge-dependent) functions / and g are defined by 

with Za^i and Zj^a ensuring the normalization of Va^i and Vi^a- When the factor graph is a tree the log partition 
function is exactly given by (6) evaluated on the unique solution of the stationarity equations (8), see for instance [24]. 
The messages Va^i (resp. rji^a) arc then the marginal probabilities for ai of a modified measure corresponding to a 
factor graph where all factor nodes around i except a have been removed (resp. only a has been removed). From the 
knowledge of the messages solution of (8) one can compute the marginal probability of the variables in the full factor 
graph law (3), for instance the marginal probability of variable i reads 

- II l^a^ziai) , (9) 

with again Zi fixed by normalization. In general factor graphs do contain loops, in that case (6,8,9) are only approxi- 
mations, at the basis of the so-called Belief Propagation algorithm discussed in more details below. 

The Bethe approximation can be easily adapted to the case where the configuration is forced to the value Tjj on 
a subset of the sites i G D, that is to the conditional measure (4). The estimation of the conditioned log partition 
function follows from (6): 

lnZ(r^) = - Y lnfE^a-(^0^-a(^0)+ElMEV'a(^a)n^-^ ' 
iiD,aedi \ Ti J a \gLea i^da J i(D \ cr, aedi / 

(10) 

where the messages {r/"!^^, depend on the imposed partial configuration t^. They indeed obey the same 

equations (8), complemented with the boundary conditions rjf'_^g^{ai) = 5cri,Ti when i G D. 
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D. Practical approximate implementation of the thought experiment 

The ideal sampling algorithm described in Sec. II B cannot be practically implemented, because the computation 
of the marginals of the probability law /i(<ljT^) has generically a cost exponential in the number of variables. One 
can however mimic this procedure, using a faster yet approximate estimation of the marginals of mC^iIZd) by means of 
the Belief Propagation algorithm. This modification of the ideal sampler, which will be called BP guided decimation 
in the following, thus corresponds to (for a given CSP instance): 

1. choose a random order of the variables, . . . , i{N), call Dq = $, Dt = {i(l), . . . , i{t)} 

2. for t= 1,...,7V: 

(a) find a fixed point of the BP equations (7) with the boundary conditions rii^a{(^i) = ^cri,Ti when i G .Dt-i 

(b) draw crj(t) according to the BP estimation of At(o'i|r£)^_ J given in (9) 

(c) set ri(t) = C7j(t) 

The Belief Propagation part of the algorithm corresponds to step 2. (a). It amoimts to search for a stationary 
point of the Bethe approximation for the log partition function, in an iterative manner. The unknowns of the Bethe 
expression, rji^a (resp. t'a^i), are considered as messages passed from a variable to a neighboring clause (resp. from a 
clause to a variable). In a random sequential order a message, say r?,;^,,, updates itself by recomputing its value from 
the current messages sent by its neighbors {i^b^i}bedi\aj according to the equation in (8). If the factor graph of the 
formula were a tree, these iterations would converge in a finite number of updates to the unique fixed point solution 
of (8). On generic factor graphs there is no guarantee of convergence of these iterations, in practical implementations 
one has thus to precise the definition of the algorithm, giving criterions to stop the iterations of the BP updates, we 
shall come back to this point in Sec. VE. 

The definition of the probability measure conditioned on the choice of the reference configuration Tjj (4) and the 
subsequent derivation of the BP equations only make sense if the formula admits at least one solution compatible with 
Tjy. In the analysis of the ideal algorithm this is automatically the case as soon as the initial formula is satisfiable. 
However this can fail in the course of the BP guided decimation algorithm because the marginals used to generate 
the configuration r are only approximate. The BP equations are no longer well defined when there are no solutions 
of the formula compatible with r^, . This shows up for instance in the computation of the message sent by a variable 
i to a clause a; whenever the product n6eai\a ^b^ii^i) vanishes for all possible values of cr^ the message rji^a can no 
longer be normalized, a contradiction has occured between the strong requests imposed by the clauses in di \ a. The 
BP guided decimation algorithm has then to stop and fails to construct a solution of the formula. 

This mechanism which unveils the contradictions in the choice of r^, and the fact that no solution is compatible 
with it, is actually equivalent to the Unit Clause Propagation (UCP) algorithm well known in computer science. For 
concreteness let us recall its functioning in the case of satisfiability formulas. UCP takes in input a list of variables 
and a list of clauses. If all clauses have length greater or equal to 2 it stops. Otherwise it chooses one of the unit 
clauses (i.e. of length 1). The variable in this unit clause must be fixed to the value satisfying the clause for the 
constructed configuration of variables to be a solution of the formula. The logical implications of this assignment 
are then drawn. All the clauses where the fixed variable appeared with the same sign as in the unit clause can be 
removed from the formula, as they are automatically satisfied. All clauses where it appeared with the opposite sign 
are effectively reduced in length, as the fixed variable will never satisfy them. This process is iterated as long as unit 
clauses are present in the formula. If clauses of length are produced during the propagation of logical implications, 
then the input formula was not satisfiable: the logical implications required at least one of the variables to take 
simultaneously its two possible values. If no contradictions occur, the set of variables that appeared in unit clauses 
during the process are termed logically implied, their value being uniquely determined by the input formula. 

It turns out that an equivalent formulation of the UCP rule for drawing logical implications can be given in terms of 
a message passing procedure known as Warning Propagation (WP) [16]. The messages {rij^aiOa^i} that WP sends 
along edges of the factor graph are projections of those of BP, where the only informations retained is whether a value 
of the variable cfi is authorized (rji^ai'^i) > 0) or not {rji^aio'i) = 0): 

Ui^aicTi) = I{rii^a{cri) > 0) , K^i{(Ti) = I(i/a^i(cri) > 0) . (11) 

The projection of the BP equations (7) leads to recurrence equations on the WP messages, 

fa^j = f({%'^a}jeaoV) , Tlj^o = 3{{^b^i}bedi\a) ■ (12) 

Monotonicity arguments can be used to show that these recurrence equations, initialized with the permissive value 
of the messages nj^o(o'i) = 1, converge to a unique fixed point independent on the order of updates of the messages. 
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Moreover this fixed point contains the same information as revealed by UCP: a contradiction occurs in UCP if and 
only if there is a variable i such that 

Yl t<a^i{ai) = V . (13) 

If no contradiction occurs the set of variables logically implied by UCP corresponds to the variables i such that there 
is only one authorized value for it, 

3\ai : H tJa^iiai) = 1 . (14) 

This corrcspondancc was explicitly shown in the case of satisfiability formulas in [22] . In our practical implementation 
of the BP guided decimation algorithm we check, after each assignment of a variable Ti(t), whether a contradiction 
in the partial configuration r^,^ can be detected by UCP /WP. According to the pseudocode above we do not imme- 
diately assign the variables which are logically implied by r^,^ ; please note however that this does not modify at all 
the subsequent steps of the algorithm, as the BP equations effectively take into account the effect of these logical 
implications. We also emphasize the fact that in general there are variables which can only take one value under the 
law ij.{-\tjj), yet that are not unveiled as logically implied by UCP/WP; if this were the case UCP would always be 
succesful on any satisfiable formula (and incidentally one would have P=NP), which is of course well known to be 
wrong. 

As a final remark on the BP guided decimation algorithm, let us emphasize another difference with the theoretical 
analysis. In practice we apply the algorithm to uniformly generated formula of CSP, with a < Ug, which are typically 
satisfiable in the thermodynamic limit, but we cannot systematically exclude unsatisfiable instances as in the analysis 
of the ideal algorithm. 



III. ANALYSIS OF THE THOUGHT EXPERIMENT WITH THE CAVITY METHOD 



A. Reminder on the usual cavity method 



The goal of the cavity method is to compute the typical properties in the thermodynamic limit of graphical 
models defined on random factor graphs, and in particular the average entropy of the associated random CSP. In the 
simplest situation, known as the replica symmetric (RS) case, the hypothesis of the method is that the Bethe-Peierls 
approximation is asymptotically exact for the large random factor graphs of the ensemble considered. For concreteness 
we explain it in a setting encompassing the fc-(xor)sat formulas, that is where each of the M = aN constraints imply 
k randomly chosen variables. The prediction for the average log partition function then follows by averaging (6), 



lim — EflnZl = - akE 



+ aE 



In 



ipiai, . . . ,crfc)r?i(o-i) . ..rikicTk) 



■E 



In 



f;(cr) 



(15) 



Let us detail the justification and the meaning of the right hand side of this relation. We have first used the 
translational invariance in the definition of the random ensemble: each term in the sums of Eq. (6) contributes on 
average in the same way. Moreover we have introduced the random variable 77 (resp. v), whose distribution can be 
constructed as follows: drawing a random factor graph, finding the solution of the BP equations (7), picking a random 
edge i — a oi the factor graph, and setting rj = rji^a (resp. v = Va^i)- The averages in the right hand side of (15) are 
thus over independent copies of -q and v, over the random constraint function ip, and over a Poisson random variable 
I of mean ak. This last quantity is the degree of an uniformly chosen variable inside such a factor graph. 

The equations fixing the distributions of the random variables rj and v can be obtained either looking for the 
stationary points of (15) or interpreting the BP equations (7) in the random graph perspective. Both reasoning leads 
to the distributional equations 



/(?7i,...,77fc_i) , r] = g{vi,...,vi) ■ 



(16) 



These equations have to be interpreted as equalities between distributions of random variables. The 77, (resp. f,) are 
independent copies of the random variable 77 (resp. J^), Hs a Poissonian random variable of parameter ak, and / and 
g have been defined in (7,8), / being itself random because of the choice of the constraint function ij). 
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The RS cavity method is based on the assumption of asymptotic correctness of the Bethe-Peierls approximation, 
which can be rephrased as the existence of a single pure state (or cluster) in the probability law /x. A complementary 
statement of the RS method concerns the local description of the law /x. Let us consider an arbitrary variable node to, 
and its depth L neighborhood, that is the set of variables at graph distance smaller or equal than L from Iq. In such 
random graph ensembles this neighborhood is, with high probability, a Galton- Watson random tree with Poissonian 
offspring of mean ak for the variable nodes (the constraint nodes being of course always of degree k). In the RS case 
the marginal of the law /i for the variables in this neighborhood converges in distribution to the law of a finite tree 
of depth L, the only effect of the rest of the graph being summarized in a single message rj acting on the boundary 
(the depth L variables), drawn independently with the fixed point distribution solution of Eq. (16). In other words 
the depth L variables would be considered independent if the constraints inside the depth L neighborhood around iq 
were removed. 

In the context of random CSPs the RS assumption is only valid for small values of the density of constraints a. 
When this parameter grows the solution space splits into a large number of pure states (clusters), which induces non- 
trivial correlations between the variables of the graphical model. The cavity method at the level of the first step of 
replica symmetry breaking (IRSB) is able to describe this situation [32]. Let us sketch some of its important features 
without entering into technical details. The assumption of the IRSB method is that the Bethe-Peierls approximation 
can still be used, but only for the probability measures restricted to a single pure state; to each such pure state is 
associated a solution of the BP equations (7). In order to handle the proliferation of pure states one introduces, on 
each directed edge of a given factor graph, a probability distribution, with respect to the choice of the pure states, 
of the corresponding BP messages. The IRSB equations linking these distributions on adjacent edges depend on the 
Parisi parameter m, which controls the relative importance of the pure states in the sampling of the corresponding 
BP messages, according to their internal sizes. A slightly more explicit interpretation of the IRSB equations has 
been given in [11, 30]. There it was shown that the distribution over pure states can be viewed as a distribution 
over 'far-away' boundary conditions, with a specific probability measure over these boundary conditions depending 
on TO. A special role is played by the value m = 1. In this case the 'far-away' boundary conditions are themselves 
drawn from the original Gibbs measure. The clustering transition, that is the appearance of a non-trivial solution 
of the IRSB equations at to = 1, can thus be related to the existence of long-range correlations of a particular type 
(so called point-to-set) in the Gibbs measure, as first discussed in [31]. These correlations measure the influence on 
a variable ig of fixing a subset B of variables, for instance B{i(),£) the set of variables at distance exactly £ from iq, 
to a reference value drawn from the equilibrium probability measure. Using the notation of conditional probability 
defined in Eq. (4) the typical long range point-to-set correlation reads 

Coo = lim lim IKcriolZBi^io/)) - M(c^io)l . (17) 

where the expectation is over the equilibrium configuration r and the factor graph model. In an unclustered regime 

the influence of the distant boundary vanishes. Coo = 0, while in the presence of clustering Coo > 0. In the latter case 
one can moreover compute the complexity of relevant clusters, that is the logarithm of the degeneracy of the clusters 
which bears the vast majority of the weights of the probability measures (at the leading exponential precision) by 
computing the difference in entropy of the conditional and original measures, M('lz_B(io i)) a-nd /i(-). The so-called 
condensation transition is signalled by a vanishing of this complexity. We shall come back in the next subsection on 
the technical details of the IRSB to = 1 computations, generalizing it to the case of partially decimated formulas. 

B. Cavity method for decimated CSP 

We turn now to the extension of the cavity method to partially decimated factor graphs. Our goal is to compute 
the residual entropy (5) by averaging the Bethe expression (10) with respect to the distribution of the conditional 
messages {riff,^^, i^a^i}- The randomness of these objects has several origins: (i) the choice of the factor graph (ii) the 
generation of the reference configuration r from the uniform measure /it (iii) the selection of the decimated variables 
in D, each independently with probability 6. The difficulty of this computation arises from the correlation between 
(i) and (ii), the measure /x being itself defined in terms of the factor graph. This dependence can however be handled 
within the context of the cavity method. Let us suppose indeed that the local properties of the original measure /i(-) 
are well described by the assumptions of replica symmetry, that is for a < Uc^. We can thus perform the computation 



the local properties are indeed of a RS type also in the clustered uncondensed regime a G [odiOc] [H]- 
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in the random tree model that corresponds locally to the random graph one. In this case the generation of the reference 
configuration r of a tree factor graph model can be done recursively, in a broadcasting way, thanks to the Markov 
property of such probability laws. The value of the root is first drawn with its marginal probability computed 
from the incoming messages according to (9). Then the configuration of the neighbors of io can be drawn conditioned 
on the value of Ti^, and the process can be iterated away from the root. Once the reference configuration r has been 
generated in such a way, the messages {rif^^, can be computed, their dependence on r arising from the condition 

VT^ai'^i) ^ ^oi.Ti for variables i in the decimated set D. At this point, for a given tree, reference configuration r and 
set D, each directed edge of the factor graph bears a pair of messages, for instance (I'a^i, i^flj) on the edge from 



constraint a to variable i. We can now define the random variable {v, v'^)e, which has the distribution of {va- 



n) 



when one takes into account the randomness in the choice of the tree, of the set D and of the reference configuration 
r, the latter being conditioned on Ti = r. Moreover the positive integer i indexes the depth of the random tree 
construction. We shall also introduce random variables having the same distribution as {rji^ajVI-Za)- ^^^r the sake of 
clarity in the following we actually introduce two versions of these random variables, {r], r]'^)i and {r], rj'^)i, the former 
being additionally conditioned on i ^ D. The equations defining these random variables by recurrence on i can now 
easily be written. 



iVi '7^)^ with probability 1 — 6 
{ri,6'^) otherwise 



(18) 



where we defined 5^ {a) = 5r,a , and 



where I is a Poisson random variable of parameter ak and (y\,vl] 
Finally one has 



.^D), (19) 
.,{vi,vj) are independent copies of {v,v'^)i. 



{u,v'')i+i = {f{r]i,...,r]k-i),f{vr 



where the {r]i,r]p) are independent copies of (??, ?7^')^> ^^'^ the configuration n, . . . ,Tk-i is drawn according to 



P[ri,...,rfe_i|T] = -V'(r,ri, 
z 



,Tk-l)r]l{Tl) ■ ■ - Vk-liTk-l) ■ 



(20) 



(21) 



Let us emphasize that the function ij; used in the broadcasting generation of ri,...,rfe_i is the same (random) 
constraint function as the one used to compute / in (20), and that the messages rjj are the same in (20) and (21). 
We shall discuss a numerical procedure for solving these equations on the example of satisfiability formulas in Sec. V. 
The prediction of the cavity method for the average residual entropy (5) can finally be expressed in terms of these 
random variables. 



afc(l-6')E 



+ aE 



E 



-ipjTi, Tk)r]i{Ti) . . . rjkjTk) 
J:r'....r'^Kri...,r'MTi)...rjk{r'k) 



+ (1-6')E 



1^1 (r) . . . ui (r) 



In 



\ai,...,ak 



(22) 



where as before I is a Poisson random variable of parameter ak and the various {rjijri^*) are independent copies of the 
corresponding random variables, with the i ^ oo limit kept understood. 

We shall now discuss an important issue that we kept under silence, namely the definition of the initial condition 
(rj, ri'^)f=o- Let us first make the connection between the computation just presented and the m — 1 IRSB description 

of non-decimated formulas, that is consider for a while the case 9 = 0. We shall call Iq the initial condition {rj, rf)i=Q = 
(rj, jj), with T] a random variable solution of the RS fixed point equation (16). It is easy to show that with such an 

initial condition {r],rf)i = (r?,r/) for all values of ^, and that (22) reduces to the RS prediction (15) for the average 

entropy. If on the contrary one considers the initial condition Ii defined by {r],ri'^)e=Q = {rj,S'^) and iterates the 
recursion equations (18,19,20), one reproduces the m = 1 computations in the form of [30, 31]. This should not be 
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a surprise : the interpretation of the m = 1 IRSB formalism presented above was precisely the study of the effect of 
a far away boundary, which is implemented here in this initial condition and in the limit ^ — > oo. In an unclustered 
phase the two initial conditions lead to the same random variable (77, 77) in the large £ limit, the effect of the far 
away boundary vanishes at long distance and the correlation function (17) is equal to 0. If on the contrary the two 
initial conditions do not lead to the same limit when £ diverges the correlation function remains positive, signaling 
the presence of clustering in the solution space. In that case the value of (22) computed with the initial condition 7i 
is the internal entropy of the relevant clusters, the difference between (15) and (22) is ascribed to the complexity, the 
degeneracy of the relevant clusters. We now come back to the decimated case 6 > 0. The two initial conditions are 
still relevant and have the same interpretation with the measure /i(-) replaced by /i(-|T^). Suppose indeed that the 
point-to-set correlation criterion were to be tested for the typical properties of I^{-\tjj). Then one would compute the 
generalization of (17), 

Coo{0) = lim lim E X! l^(^'o lrDus(io,«)) " /^(^iotn)! . (23) 

where the expectation would include an average over the set D of 9N variables. This is precisely what is realized by 
the two initial conditions followed by the recursion relations (18,19,20) for this value of 9. Again we shall conclude 

on the absence of clustering in iJ.{-\Tjy) when the two initial conditions have the same ^ — > 00 limit, and obtain the 
typical complexity of the clusters from the difference in the values of (22) for the two different initial conditions. 



C. The cavity computation of the average number of logically implied variables 

Another quantity which can be interesting to compute is the amount of logical implications, in the UCP/WP sense 
explained in Sec. II D, induced by the choice of the partial reference solution T]j. Let us define as 



(f){6)=6+ lim —EirET-ED[nb. directly implied variables] 

iV— >(x) TV 



(24) 



the average fraction of variables which have been explicitly assigned or which can be logically deduced from these 
assignments. A motivation for the study of this quantity, as explained in [22], is that ^ — 1 is the average number of 
the newly implied variables as a single assignment step is performed. The size of this set of variables, and in particular 
its possible divergence with the formula size when ^{6) is discontinuous, should thus be a measure of the sensitivity 
of the decimation procedure with respects to small errors made by the BP version of the procedure with respects to 
the ideal one. 

The computation of (p{d) can be performed, within the RS assumptions on the local structure of the probability 
law /i(-), in the locally equivalent random tree model. For a generic CSP one can reproduce the reasoning above, 
replacing the conditional messages r/-^^ by their WP counterparts n~^^ according to the projection defined in (11). 
This leads to the similar definition of sequences of random pairs of variables, (77, n"^)^, {ri,n'^)i and {ri,X)'^)e, which 
obeys recursion equations analogous to Eqs. (18,19,20), 




with probability 1 
otherwise 



e 



(25) 



iu,x>^)i+i = (/(r?i, . . . ,%_i),f(KlS . . .,nl-i)) 
The function (f){9) can then be computed as 



(26) 



(j){9) = E 



(27) 



in the £ ^ 00 limit. This form of the computation is valid for any CSP; in some particular cases one can however 
devise a more efficient (and numerically more precise) formulation, using the specific form of the constraints rules to 
replace the WP message in the second entry of the random pair by a probability of implication. We refer the reader 
to [22] for further details on this computation for random satisfiability formulas. 
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IV. APPLICATION TO THE XORSAT ENSEMBLE 



We apply in this section the general formalism developed above to the ensemble of xor-satisfiability formulas. As we 
shall see great simplifications occur for this simple model, and the study of both the decimated ensemble of random 
formulas and of the BP guided decimation algorithm can in fact be performed with simpler methods [28, 29, 33]. It is 
however an instructive toy model to begin with before handling the more involved case of satisfiability formulas, and 
has deep connections with information theory, in particular with the Low Density Parity Check codes. The analysis 
of the "Maxwell decoder" in [34] is actually very tightly related with the content of this section. 



A. BP equations 

The constraints of a xorsat CSP have been defined in Eq. (1). The set of solutions of such a CSP exhibits some 
simplifying symmetries, even in the decimated case where some variables are fixed to a given value. It is indeed easy 
to realize that a non-decimated variable is either fixed to +1 in all the solutions, or fixed to -1, otherwise it takes 
the value +1 in exactly half of the solutions and -1 in the other half. Consequently the BP messages Va-*i and rji^a 
can be restricted to a set of three possible types, that we shall encode with three-valued numbers Ua^i and hi^a 
according to the following correspondence: 



^ 4^ Ua^i = 



2 

With these notations the BP equations (8) can be rewritten as 



^/ii^„ = -l . (28) 
■9 hi — >.Q — 



{0 if ufc^, = V6 G 9z \ a 
-1-1 \i3b e di\a with u^^i = 1 . (29) 
— 1 \i 3b <E di\a with = — 1 

The second expression is well defined as long as no contradictions are detected, which means that the conditions in 
the last two lines are not fulfilled simultaneously. The boundary condition for a decimated variable i € D is hj^^ = n 
for all neighboring interactions a G di. 

Due to the symmetry of the model the BP equations (29) can actually be regarded as WP equations that express 
the simplifications of the Unit Propagation rule. The first equation reflects the fact that a constraint imposes the 
value of one of its variables if and only if the k — 1 other variables arc fixed (either by decimation or by propagation 
of logical implications), the second means that a variable is fixed as soon as one of its neighboring clauses imposes its 
value. 



B. The cavity method computations 

Following the general formalism introduced in Sec. Ill for the treatment of decimated CSPs and the parametrization 
of the BP messages in terms of u and h, we have to find the distributions of the random variables {h, h'^)i, {u, u^)i and 
{h,h^)i for T = ±1. These are the solutions of Eqs. (18,19,20) specialized to the functions /, g and tjj of the xorsat 
model. The relevant solution of this equation takes a particularly simple form which allows for an analytic solution. 
The results of [28, 29] imply indeed that the set of solutions of a non-decimated fc-xorsat formulas is described by the 
trivial solution of the RS equation, u = h = 0. Moreover the value of the conditional message h'^ cannot be equal to 
— T : by definition h'^ describes the measure where the reference solution has been drawn conditional on its value at 
the root being t, whereas h'^ = — r would mean that all solutions compatible with the values of the reference on some 
subset of variables D have — r at the root, which is in contradiction with the hypothesis. In consequence the solution 
of (18,19,20) can be looked for under the form: 

fi. ^.T\ d J (0,0) with probability 1 - a;<> ^ d J (0,0) w.p.l-ye rT\ d J (0,0) w.p.l-(j)e 

(0, T) with probability xe (0, r) w.p. ye (0, r) w.p. (pe 



(30) 

6* -h (1 - 9)xe , xe = l- exp[-akye] , ye+i = 4>\~^ , (31) 



Inserting these forms in (18,19,20) leads to the following equations : 

...1 . .^^ ^ 
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FIG. 1: Illustration of the recurrence equation (32) for 3-xorsat. Left panel: a — 0.5 < a*(fc = 3) = 2/3, right panel: 
a — 0.8 > a*. On each of the plot the three curves are for different values of 9 (growing from bottom to top). 




FIG. 2: Fixed point solution(s) of Eq. (32) for fc = 3. Left panel: q = 0.5 < a.(fe = 3) = 2/3, right panel: a = 0.8 > a.. The 
solid line is (t>{0), the dashed line in the right panel is "4^ (6), the largest fixed point solution of Eq. (32). 



which can be closed under a single recursion equation on (pi, 

=^^ + (1-0) (l-e-"'^^'") . (32) 

The fixed point equation (pe+i = (j)£ has between one and three distinct solutions on [0, 1], depending on the values 
of a and 9 (examples of the various situations are provided on Fig. 1). A quick analysis of the equation shows that 
for a < a*, with 



1 /k^l'^''^^ 



k \k-2 



(33) 



the equation (32) admits a single solution for all values of 6. If on the contrary a > a^,, there exists a range of 9, 
denoted [9-{a), 9+{a)], where Eq. (32) admits three solutions in [0, 1]. In that case we shall call (j){9) (resp. ip{9)) the 
smallest (resp. the largest) of these three solutions. Some examples of these curves are shown in Fig. 2, and the lines 
9±{a) are displayed in Fig. 3. 

The expression of uj given in Eq. (22) can be computed using the ansatz (30), 

to = {\ii2)[ak{l - 9){1 - xy) - a(l - 0'=) - (1 - 9){{ak){l - y) - exp[-aA:?/])] , (34) 

where the limit ^ — > cx3 is kept understood. Using the relations between a;, y and </> stated in (31), one can express 
this residual entropy in terms of (p, 

Qicp) = {lii2)[l - (P - a + ak{l ~ + acj)''] . (35) 

When a < a* the fixed point solution of (32) is unique, hence w(0) = u){(p{9)) is a smoothly decreasing function of 
9, as plotted on the left panel of Fig. 4. For larger values of a, i.e. a > a*, we have seen above that there exists 
a range of parameter 9 £ [9^{a),9+{a)] where two solutions of (32), (p{9) and ip{9), coexists. On the right panel of 
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T 



0.2 - 




FIG. 3: Phase diagram in the (a, 9) plane for the ensemble of 3-xorsat decimated random formulas, the three critical lines meet 
at a.(fc = 3) = 2/3. 




FIG. 4: Residual entropy io{9) for 3-xorsat. Left panel: a — 0.5, right panel: a — 0.8. The solid line is the true entropy 
i^{0) = msLx[u){(f){6)),u){tf}{9))], the dashed lines are the two irrelevant branches, mm[Q{(j}{6)),Q{tjj{9))]. 

Fig. 4 one can see that the two branches of the entropy, uj{(p{d)) and tj{ip{9)) cross each other at an intermediate value 
Oc{a) € [9-{a), 9+{a)], which is also plotted in function of a in the phase diagram Fig. 3. It is natural (and we shall 
argue in the following that it is the correct choice) to consider that in the region of coexistence the relevant branch is 
the one leading to the largest entropy, uj{9) = max[tD((/)(0)), £('0(0))], which thus exhibits a discontinuity in its slope 
when 9 crosses 9c- 

A direct justification of this choice will be given in the next subsection, here we argue in its favour on the basis of 
the cavity method. The two boundary conditions discussed in Sec. IIIB corresponds to (fii^o = (/q) and ipi^o = 1 
(/i). When the fixed point solution of (32) is unique both initial conditions lead to the same limit (/)(0) in the large 
£ limit, which leads to the conclusion that there is no clustering in the solution space of the decimated formula for 
these values of a and 9. 

In the region of coexistence these two initial conditions yield, respectively, 4>i — > (j){9) and (f>i — > "0(0) as i diverges. 
One is thus led to assign the difference u){<j){9)) — Lj{ip{6)) to the complexity of the decimated formula, that is the 
contribution of the entropy due to the presence of clusters in the measure ^{-iDgN). This interpretation is valid only 
when the complexity is positive, that is in the range [9^ ,9c]- A condensation transition occurs when the threshold 9c 
is crossed. In the region [9c, 0+] only a subextensive number of clusters are relevant, and the total entropy is equal to 
their internal entropy. The latter being given by the thermodynamic computation with the initial condition Ji, one 
concludes that lu{9) = Lo{'tp{9)) for 9 e [9c, 9+]. 

C. A more direct computation and its interpretation 

It is instructive to rederive the above results on xor-satisfiability decimated formulas by more direct means. Let us 
first show that for this computation one can assume that the formula is unfrustrated (i.e. Ja = 1 for all constraints) 
and that in the reference solution r all variables are fixed to = 1. Suppose indeed that the formula has been 
generated with random Jq = ±1. As we have conditioned the ensemble of formulas on satisfiable ones, there is at 
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least one solution, call it By the gauge transformation cr^ ^ (j[ = (Tiaf^' , the solutions a of the original problem 
are in bijection with the solutions a' of the unfrustrated model. Consider furthermore a reference solution r of the 
unfrustrated model and a set D of variables, such that the decimated problem to solve reads 

]J = 1 Va , a'i = Ti \/i G D . (36) 

Applying now the gauge transformation cr- a'/ = cr^Ti, noting that r is a solution of the unfrustrated model, one 
reduces the problem to 

JJ CT^ = 1 Va , a'l = 1 Vi e £> . (37) 

Having get rid of the signs in the constraints and in the reference solution, the size and the structure of the set of 
solutions of the decimated problem can be deduced from the underlying hypcrgraph of constraints [28, 29, 33]. 

The initial hypergraph is drawn uniformly with aN clauses of length k among N variables. A fraction 9 of the 
variables are fixed to +1, and can thus be eliminated from the constraints which are reduced in size. Unit Clause 
Propagation can then be run to propagate these simplifications. The details of this computation are deferred to 
Appendix A, wo only quote here the results. When UCP stops, there arc A^(l — 4){9)) variables unassigned, with 
(/)(^) the smallest fixed point solution of (32). The simplified formula contains constraints of all lengths k € [2,A;], 
more precisely there are aA^(^)(l — 0)'^^'°"'' clauses of length k. The unassigned variables have a Poisson degree 
distribution with average ak{l — (f)^^^). 

At this point the structure of the solutions of this reduced formula can be studied with the leaf removal algorithm [28, 
29]. The details are again deferred to Appendix A. One finds that the presence of an extensive 2-core is equivalent 
to Eq. (32) admitting more than one solution, i.e. if a > a* and in the interval 6 G [^?_,^?+]. If this is the case, the 
larger solution ■0 gives the fraction of the variables which are either fixed at the end of UCP, or in the backbone of 
the UCP-reduced formula. The difference between the number of variables and the number of clauses in the 2-core is 
N{uj((f)) — oj(tjj))/ {ln2). In the interval 6 E [6^, 9c] this quantity is positive, hence it is interpreted as the entropy of 
the number of solutions of the 2-core, i.e. the complexity of the reduced formula. In [6c, ^+] the negative complexity 
is due to rare events, typically the 2-core only contains a sub-exponential number of solutions, hence the discontinuity 
in slope of uj{0) at this condensation transition 0c. For a > (the usual dynamical threshold) the original formula 
already has a 2-core, hence = in this case. Similarly 9c ^ for a > ac, the satisfiability transition of the standard 
ensemble. 

Let us remark that the density of clauses of length 2 in the UCP-reduced formula reads (l/2)a'fc(fc — 1)(1 — <^)^''~^. 
When reaches 6+ from below this density reaches 1 /2 and thus the sub-formula made of length 2 clauses percolates. 
Indeed 6*+ is the point of disappearance of the solution (p{6) from the Eq. (32), hence by the implicit function theorem 
the derivatives with respect to of the two sides of Eq. (32) are equal at that point. 

D. Numerical experiments on BP guided decimation 

We present in this section the results of numerical experiments performed with the BP guided decimation algorithm. 
According to the definitions given in the general setting, these experiments consisted in generating a random xorsat 
formula (with Ja = ±1 with probability one half) and assigning step by step the value of the variables. The variables 
were assigned in an uniformly random order. Each time a variable is assigned the BP equations (29) are iterated until 
convergence is reached or a contradiction is detected (that is a variable i receives at least two contradicting messages 
u = +1 and u = —1 from the neighboring clauses). As long as no contradiction is found, the value of the next variable 
to be assigned is drawn according to the BP estimation of its marginal probability. In this simple model this BP 
marginal is either completely unbiased (when all incoming messages u from the neighboring clauses vanish), in which 
case the value of the variable is ±1 with equal probability, or completely biased, and the assignment is nothing more 
than the validation of an implication of previous choices. A run of this algorithm is successful if it assigns the value of 
the A'' variables without encountering any contradiction, the configuration obtained at the end of the process is then 
a solution of the formula. 

In the left panel of Fig. 5 we present the probability of successful runs, with respect to the choice of the formula 
and to the randomness in the course of the run (order of the variables and free choices for unbiased marginals). It 
goes to a finite value in the thermodynamic limit (which can be computed analytically, see below) for a < a* , and to 
for a > a* . A further piece of information is given in the right part of Fig. 5 about the number of steps performed 
by the algorithm (i.e. the number of variables assigned) before it stops. Let us call this random variable Thait and the 
associated fraction Ohait = The,n/N. We plotted the mean and variance (represented by error bars) of ^hait) computed 
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FIG. 5: Left panel: probability of success of BP guided decimation on 3-xorsat random formulas, the solid line is the analytical 
prediction [36, 37] in the infinite size limit (cf. Eq.(39)), which vanishes for a > q«, the symbols are the results of numerical 
simulations on 800 formulas of size N = 2 ■ 10"* variables for each value of a. Right panel: the solid line is the curve 0+{oi), 
symbols indicate the mean and variance (over unsuccesful runs) of the fraction of variables assigned before a contradiction is 
detected, from numerical simulations on 800 formulas of size N — 2 ■ variables for each value of a. 



only on unsuccesful runs. For a < a, one finds that ^hait converges in the thermodynamic limit to a non trivial 
random variable. On the contrary in the regime where the algorithm fails w.h.p. (i.e. for a > a*) the variance of ^hait 
vanishes at large N (this result was obtained by performing the simulations at various sizes, which is not shown on 
the plot). In this case ^hait concentrates around its mean, which is found to coincide with the function 0+{a) defined 
above (see in particular the inset of the right panel of Fig. 5) . 

These numerical observations can now be interpreted in the light of the analytical computations performed above, 
which were mimicking the decimation process using the perfect marginals instead of the BP estimation. The threshold 
a, above which the BP guided decimation algorithm fails w.h.p. coincides with the point where the evolution in the 
(a, 6) phase diagram has to cross the transition lines drawn on Fig. 3, and in particular to penetrate the region 
[9c{ct),9+{a)] where the 2-core of the residual formula only admits a sub-exponential number of solutions when the 
perfect marginals are used for the decimation. In this case the algorithm is naturally very sensitive to the small 
mistakes made by the BP algorithm, which destroy the few solutions of the 2-core. The fact that the residual formula 
is no longer satisfiable remains however unnoticed until a fraction 0+{a) of the variables have been assigned. At 
this point the fraction (/)(0) of decimated and logically implied variables has a finite discontinuity (see right panel of 
Fig. 2), which means that the assignment of a few new variables triggers an avalanche of implications of extensive 
size. The extensive subgraph of newly implied variables will contain implication cycles which, if some mistakes have 
been done in the previous assignment steps, will lead to contradictions. More quantitatively, it was underlined above 
that 0+ (a) marked the percolation of the subformula of length 2 clauses which supports the propagation of the logical 
implications. 

The simplifying symmetries of the xor-satisfiability formulas are such that BP guided decimation is here almost 
equivalent to the Unit Clause Propagation algorithm with random heuristic (and also to the random pivoting Gaussian 
elimination algorithm of [35]). The only slight difference between the two lies in the order in which the variables are 
treated, the logical implications being propagated as soon as they are detected in UCP. In the BP description of the 
algorithm the implication is effectively taken into account by the propagation of the messages, even if the variable 
is not explicitely declared as assigned. The behavior of UCP on xor-satisfiability formulas has been studied in [33], 
the results we just found are in agreement with the ones of this paper. In particular the phase diagram in Fig. 3 
reproduces the left panel of Fig. 3 in [33] , apart from the difference in the definition of the vertical time axis explained 
above. The equivalence with UCP allows also the computation of the probability of success in the thermodynamic 
limit for a < a^. A detailed derivation for satisfiability formulas can be found in [36, 37], we state here the result 
without proof. 



-^succ — exp 



dt 



fit) 



2 1 



4(l-i)l-/(t) 



with f{t)^ak{k-l)t''-^{l-t) 



(38) 



For k — 3 this expression can be further simplified. 



-^succ — exp 



3a 

T 



1 



2 ./^^^ 

V a 



: arctan 



(39) 



This function is plotted as a solid line in the left panel of Fig. 5 and agrees with the results of the numerical experiments. 
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V. APPLICATION TO THE SAT ENSEMBLE 



We turn in this section to the case of random satisfiabihty formulas. We shall first apply the analytical cavity 
formalism to this particular model, then present the results of numerical experiments with the BP guided decimation 
algorithm and confront the two approaches. 



A. BP equations 

Let us begin by cxpliciting the BP equations (8) for the satisfiability constraints defined in (2). As the variables 
CTj are binary the messages {va^iicTi), rii^a{(^i)} can be parametrized with a single real for each, that we shall denote 
{ua^i, hi^a}, under the form: 

, , 1 - Jf CTj tanh Mo^i 1 - Jfo-jtanh/ij^a 

Ua^i{(Ti) = , Vi^a{<Ti) = . (40) 

As we included the coupling contant Jf in these definitions a positive value of, for instance hi^a, does not indicate a 
bias of (Tj towards the value +1 in the absence of clause o, but rather towards the value — Jf that does satisfy a. The 
message sent by a clause to one of its variables is then found to be 

Ua^i = f{{hj^a}jeaa\i) , /(/ii,---,/ife-i) = -^In 1^1 - n — tanhfa. j ^ ^^^^ 

To give the explicit form of the other set of BP equations it is advisable to introduce some further definitions. We 
shall call 9+i (resp. d-i) the set of clauses in di which are satisfied by cTj = +1 (rcsp. o-j = —1), that is d„i = {a e 
di\Jf = — cr}. Moreover wc let d+i{a) (rcsp. d-i{a)) denote the set of clauses vn di\a agreeing (resp. disagreeing) 
with a on the value i should take. In formulae, d+i{a) = {b€ di\ a\J^ = Jf }, d-i{a) = {b € di\Jf = - Jf }. With 
these notations the message sent by a variable to a clause reads 

hi^a = ^ Ub^i - ^ Ub^i , (42) 

6e9+i(a) 6e9_i(o) 

while the marginal probability of a variable i reads from (9): 

1 + CT,tanh(i/,) t7 /.on 

Mi(o-i) = ^ , Hi= Ua^i - Ua^i . (43) 

Finally when a subset of variables D is fixed to a reference configuration Tjj, the BP equations (41,42) are comple- 
mented with the boundary condition hf^^ = — JfrjOO when i G D. Note that in all the numerical implementations of 
these equations wc keep the hyperbolic tangent of the messages u and h which are free from this apparent singularity 
in the definition of h around a decimated variable. 



B. The usual cavity method 



The replica symmetric version of the cavity method for non-decimated random satisfiability formulas, following 
the general formalism recalled in Sec. Ill A, corresponds to a probabilistic interpretation of the BP equations (41,42) 
applied to random factor graphs. As the cardinality of di \ a converges to a Poisson random variable of parameter 
ak and the sign J? in the constraints definition are ±1 with probability one half, it follows that |9+i(o)| and \d-i{a)\ 
converge to two independent Poisson random variables with parameter ak/2, denoted and Z_ below. One has thus 
to look for the solution of the distributional equations corresponding to (41,42) for the random variables h and u: 

1+ I- 

h = '^Ui-'^Vi, u = f{hi,...,hk-i) . (44) 

i=l j=l 

In these equations hi,. . . , hk-i are independent copies of h, ui, . . . , ui^ and vi,...,vi_ are independent copies of u. 

A numerical determination of the fixed point distributions solutions of these equations can be achieved by the 
population dynamics method, revivified in this context by [32]. This consists in representing the random variable 
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u (resp. h) by a sample of Af elements ui, . . . ,uj\f (rcsp. hi, . . . , h^). The sample representing h is initialized 
arbitrarily, for instance hj = for all j £ [1,.A/], then the two samples are updated alternatively as follows. A new 
sample representing u is obtained from the representation of h by, independently for each j e ]: 

• drawing k — 1 indices ii,. . . , ik-i independently, uniformly in [l,Af] 

• setting Uj = fihi^,. ■ ■ 

Subsequently the sample of h is updated, for each j , by: 

• drawing and 1-, two Poisson random variables of mean afc/2 

• drawing Z+ + Z_ indices if , . . . , , . . . , independently, uniformly in [1, A/] 

• setting hj = ELti - En=i «i- 

The replica symmetric description of the solution space of random satisfiability formulas (first obtained with replica 

computations in [38]) is only valid for low enough values of a. The IRSB analysis at m = 1. described in generic 
terms in Sec. Ill A, has been performed on the satisfiability ensemble in [11, 30]. For fc > 4 one finds a clustering 
transition at ad and a condensation one at a.c , before the satisfiability transition as determined in [7 9] (for instance 
Q!d ~ 9.38, ttc « 9.55 and ag ~ 9.93 for k = 4). In the intermediate regime [ad,Q;c] the complexity of the relevant 
clusters is positive and vanishes at ac, a point beyond which most of the solutions are contained in a sub-exponential 
number of clusters. The value fc = 3 happens to be a particular case, for which the intermediate regime with a positive 
complexity is absent, that we shall not consider in the following. 



C. The computation of ui{d) 

Let us now apply the formalism of Sec. IIIB to ensemble of decimated random satisfiability formulas. Using the 
parametrization (40) of the messages the random variables {r],rj'^)e, {ri,rf)i and becomes random pairs of 

reals, respectively (/i, h'^)t-, {h, h'^)e and {u, u^)t, for r = ±1. In the same spirit as we include the coupling constants in 
the definitions (40) of the fields u and h, we also "gauge" the definition of these random variables such that u+ (resp. 
u~) corresponds to the message wff^^j where Zd is drawn conditional on Tj satisfying (resp. not satisfying) clause a. 
The recursion equations (18,19,20) then reads 

iuTr\ d \{h,h'^)i with probability 1 - 6* lu ur\ <^ r -r 1 iAt^\ 

= {h, TOO) Otherwise ' ('^' )^ = E " E E - E ' (^5) 

where l± are two independent Poisson random variables of parameter afc/2 and the («», uJ) and {v, vj) are independent 
copies of {u,u^)i. Finally Eq. (20) translates into 

(w,u-),+i 4 (/(/ii, . . .,hk-i),f(hl\. . . ,C"i )) ' (46) 

where the configuration of the variables n, . . . , Tk-i is drawn with one of the two following probability laws according 
to the value of r, 

Prob[ri, . . . , T,_i[r = +, fti, . . . , hu-i\ = TT , (47) 

i=i ^ 

or 

A: — 1 

r. ur I ^ 1 1 - II(ti = • • • = Tfe-i = -1) -pr 1 + ritanh/ii 
Prob[Ti,...,Tfc_i|r = -,/ii,...,/ifc_i] = , ^fc-i i.tanh^ - 11 2 • ^^^^ 

lli=l 2 i=l 

The fields hi are the same for the computation of u in (46) and in the probability law of the t^'s expressed in (47,48). 
The two initial conditions correspond to (/i, h'^)i=o = {h, h) for the initialization called /q, and {h, h'^)e=o = {h, too) 
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for Ii. Finally the average entropy of the decimated random formulas reads from (22) 

L0= - ak{l -e)E 



y-\ 1 + r tanh('U + h) ^ f ^ tanh u'^ tanh h'^ 



E 

(l-6')E 



1 - I(ri 



1 - nu 



k 1— tanh 



= -1) -pr 1 + Ti tanh /? ., , / TT 1 ~ tanh /ip 
K il 2 11 2 



.Tl,---,Tk ' 111=1 2 1=1 

1 + Ttanh fe^li - ^^"i t;, I / ^ i + atanhwj -X 1 - atanht;"" 

E ^iMEn — 2 — 

<T 1=1 i=l 



(49) 



A numerical determination of the distribution of the random variables {h,h'^)e and {u,u'^)e can be performed 
by a population dynamics algorithm. Wc introduce two population of Af triplets of reals, {{hi, hf , h~)}-^^ and 
{{ui, ,u~)}fLi, such that, for instance, the empirical distribution of {hi, hf) after £ steps of the algorithm is a good 
approximation of the random variable {h, h'^)e- In the initialization step of the algorithm the hi's are drawn according 
to the fixed point solution of Eq. (44), which is obtained from a preliminary RS population dynamics procedure. For 
the initial condition Iq (resp. h) one sets = h^ = hi (resp. hf = ±00) for all i G Then the following 

two kind of updates are iterated i times. A new sample of {(uj,u^,u^)}^i is obtained by, independently for each 

• drawing k — 1 indices ii, . . . , i^-i independently, uniformly in [1, A/] 

• sotting Uj = f{hi^ hi^_J 

• independently for n = 1 , . . . , fc — 1 



— with probability 6 set /i+ = +00 and = —00 

— otherwise set = h^ , and h^ = h^ 

• generating a configuration ti, . . . , Tk-i from the law Prob[Ti, . . 



,Tk-l\T 



,ht^_^] defined in Eq.(47) 



• setting u 



fih? 



'fe-i 



• generating a configuration ti, . . . , Tk-i from the law Prob[ri, . . . , Tk-i |r = — , /lij , . . . , hii,_^] defined in Eq.(48) 

• setting uJ = f{K[\. . . ,hl^Si) 

Subsequently the sample of {{hi, hf , h^)}^i is updated, for each j, by: 

• drawing and two Poisson random variables of mean ak/2 

• drawing + Z_ indices i'^ , . . . , . . . independently, uniformly in [1, A/"] 



setting hj = Y^n^l Ui+ - En=l ' = Sn=l ~ Sn=l ' 



and = J2n=i V - E„=i 



Aftcr a large number of these iterations has been performed the determination of the residual entropy (49) is easily 
obtained: the expectation values can be interpreted as empirical averages over the population. 

We have implemented this numerical procedure and performed the computation for various values of a and 9. The 
results for /c = 4 are as follows. For small enough values of a the large i limit of the recursion relations (45,46) is found 
to be independent of the initial condition Iq or Ii used, and the residual entropy density lj{9) is a smoothly decreasing 
fimction. This quantity is plotted for a = 8.8 on the left panel of Fig. 6. For larger values of a there appears a regime 
9 £ [9 -{a), 9+ {a)] in which the two initial conditions leads to different fixed point solutions of (45,46), signaling the 
presence of non-trivial long range point- to-set correlations in the decimated formula. The two branches of uj{9) are 
plotted in the right panel of Fig. 6 for a = 9.3, and are found to cross each other at 9c{a) E [9^{a),9-^-{a)]. For 
9 € [9-{a),9c{a)] the branch with the highest value of w corresponds to the Iq initialization, the situation being 
reversed for 9 G [9c{<y),9+{a)]. As explained on the simpler xorsat example, we interpret these results as following 
from the existence of a positive complexity of relevant clusters in the regime , 9c\ ■ In this case the highest branch of 
Lij{9) is the total entropy of the decimated formula, while the difference between the two branches is its complexity. On 
the contrary for [9c, 9+] the upper branch is the only relevant one, the total entropy is dominated by the subexponential 
number of clusters around a typical reference solution t. The three critical lines are displayed in the (a, 9) of Fig. 7, 
which also shows that, as follows from their definitions, 9-{a) (resp. 9c{a)) reaches the horizontal axis = at the 
usual dynamic transition ad (resp. condensation threshold etc). We estimated the location of the critical point where 
9± and 9c merge to be a* = 9.05, ^* = 0.045, by interpolation of the results obtained for values of a slightly larger. 
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FIG. 6: Residual entropy io{9) of 4-sat random formulas. Left panel: a = 8.8, a;(6) is smoothly decreasing. Right panel: 
a = 9.3, the inset is a zoom around the singular point of u;{6). 
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FIG. 7: Phase diagram of 4-sat random formulas in the {a, 6) plane, from bottom to top 6 -{a), 6c{a) and 9+{a). Symbols 
result from the population dynamics algorithm, lines are guides to the eye. 



D. The computation of 



We proceed now with the computation of the fraction of logically implied variables, following the lines sketched in 
Sec. IIIC (the same results were presented in [22] with a slightly different formulation). 

Let us first discuss the Warning Propagation equations for satisfiability formulas. According to the projection 
equations (11) one has to identify the situations in which a single value of a variable Ci is allowed by a BP message. 
For the messages sent by a clause to a variable this can only happen when the variable is forced to satisfy the 
clause, i.e. Va^i{(Ji) = 5ai-jf, in which case we define the WP message to be Ua-»i — 1, otherwise UQ_i = 0. A 
message ?7i_,a sent from a variable to a clause can allow both values of variable Ui, or force it to the value satisfying 
a {r]i^a{<^i) — ^cTi-Jf), or to the value disastifying it {r]i^a{<^i) — Sai,j^)- It is only the latter case that shall be 
propagated by the WP equations, we thus affect the value t)i^a = to the first two situations and t)i^a = 1 to the 
latter. The WP equations read with these definitions: 



n 



(50) 



For a variable i E D the boundary condition reads [)i^a = i{Ti = J°). 

In order to compute the average fraction of logically implied variables, within the assumptions of the RS cavity 
method on the local description of the uniform probability measure /i(-)) we introduce the sequences of random 
variables {h,i)'^)e, (/i, f)'^)^ and {u,u^)e as defined in Sec. IIIC. It turns out that not all the values of r have to be 
considered. Consider for instance the random variable (u, u'^)^. Its distribution is by definition the one of (ua^i, u^f^J, 
in the random tree model of depth i, rooted at variable i which appears solely in the clause a. Depending on r the 
reference configuration r is drawn conditional on either satisfying (if r = +1) or not satisfying (r = —1) the 
constraint a. In the latter case u~'^^ is necessarily equal to 0: at least one of the variables in 9a \ i must satisfy a 
in r, and this variable cannot be forced to its opposite value by r^,. We can hence restrict our attention to (m,u+)^. 
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which is found to obey 

r +^ d , . r- . ^ d Jl with probabihty rii-i ^^^f^ ^r,. 

(n,u+W, = (/(/.„..., /^,_0,C[)i...f)._,), C=|o ^^j^^^^.^^ . (51) 

The probability that the random variable C equals 1 is indeed the probability that, conditional on n satisfying the 

root clause a, the configuration of the fc — 1 other variables in 9a \ i arc drawn to the values unsatisfying a. 

For similar reasons the right hand side of this equation does not depend on (ft, f)+) and we can complete this 
equation with the recursion on (ft, f)~) and (ft, ()~), which read 

where as usual l± arc two Poisson random variables of parameter ak/2 and the («;. u^) and ((,',;. X)f ) arc independent 
copies of (u,u+)£. Finally the average fraction of either decimated or directly implied variables can be obtained as 

0(61) = E[(l - tanh h)i)-] . (53) 

These recursion equations can be solved numerically using the same kind of population dynamics as explained above, 
updating in turns populations of pairs {{hi,l)^)}^i and {(w^, u~)}^]^. The two kind of initial conditions already 

discussed correspond here to (ft, t)~)i=o = (ft,0) for Iq, and (ft, l)^)£=o = {h, 1) for Ii. 

This numerical resolution leads to the following results for fc = 4. At small enough values of a the two initial 
conditions lead to the same large £ limit, the function (l){9) is smoothly increasing (see left panel of Fig. 8). For 
larger values of a there exists a range of parameter [6'_{a), 9'^{a)] where the quantity (53), computed from the initial 
condition /i, is strictly greater than the one reached from Iq. In this coexistence regime we shall call ijj{0), in analogy 
with the notations used for the xorsat model, the upper branch obtained from Ji , see for instance the right panel of 
Fig. 8. The function (p (resp. tp) is discontinuous at 6',^ (resp. 6'_). These two thresholds are the two upper curves in 
the phase diagram of Fig. 9, which also contains for comparison a repetition of the phase diagram of Fig. 7. The two 
regimes for the behaviour of (^(0) are separated by the value a'^ « 8.05. 

At this point the reader might be puzzled by the apparent contradiction between these results and those of the 
previous subsection. Consider indeed some parameters a > a'^ and 6 £ [6'_{a),6'_^{a)]. We claimed in the previous 
subsection that the large i limit of the random variable {h,h'^)t was independent of the initial condition in t — Q, 
whereas we just found that (ft, f)")^ does depend on it. As the latter variable is a projection of the former, this 
statement is at first sight paradoxical. This apparent contradiction can however be resolved by a closer inspection of 
the relationship between the two random variables. One has indeed 

(ft, ()-)£= lim(ft,I(tanhft- < -1+e))^ , (54) 

£— »0 

while the two apparently contradictory statements are 



Hm (ft, h-)['' 4 Km (ft, ft-)^i , Hm (ft, f)")? ^ /im (ft, • (55) 



The resolution of the paradox relies on the non-commutativity of the limits oo and £ ^ 0. More explicitly, under 
the initialization /q there is a positive probability for a field tanhft" to have -1 as its large i limit, yet remaining 
strictly superior to -1 as long as (. is finite. If the limit £ ^ is taken before (. oo these fields do not participate to 
k)~ , which is thus found to be smaller in the initialization Iq with respect to Ji. Yet if the limit £ ^ oo is performed 
first this positive fraction of the fields tanhft" (with initialization Iq) reach their limit —1, hence making possible the 
first statement of Eq. (55) . We checked explictly this phenomenon by constructing a coupling of the two initializations 
and solved it with the population dynamics algorithm. 



E. Numerical experiments on BP guided decimation 



We have run the belief propagation guided decimation algorithm for many random 4-sat formulas. The sizes of the 
formulas studied are N = 10'^, 3 x 10'^, 10^,3 x 10^, with a varying between 6.0 and 9.2. The number of formulas 
analyzed varies with a, but it is always larger than 2000 for N — 10'^, larger than 1200 for A'' = 3 x 10^, larger than 
400 for N = 10'' and between 360 and 25 (increasing a) for A^ = 3 x 10''. 
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FIG. 8: Fraction of assigned and logically implied variables in 4-sat random formulas , left: a = 7.0, right: a = 8.4. 
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FIG. 9: Phase diagram of 4-sat random formulas in the (a, 0) plane. The three lines in the bottom of the figure are those of 
Fig. 7, the upper two are 6'-{a) < 0'+{ct) defined in Sec. VD from the discontinuities of (j>{9) and tp{9). The filled circles show 
the location Smax(Q) of the slowest convergence of the BP iterations in the BP guided decimation algorithm, see Sec. VE2. 

1. Details on the practical implementation 

Some technical details about the numerical implementation of the BP guided decimation algorithm were given 
in [22] (see also Appendix A of [30] for details on the representation of BP messages and probabilities). The main 
numerical bottleneck in applying the BP guided decimation algorithm is the convergence of the iterative method for 
solving the BP equations, described in Section II D. This iterative scheme is known to be a fast way of finding a fixed 
point of the BP equations, although sometimes it may not converge. Lack of convergence may be due to different 
reasons: in case long range correlations develop, multiple BP fixed points appear and the convergence of BP to one of 
these fixed point cannot be guaranteed; on the other hand, when a single BP fixed point exists, convergence problems 
can be typically cured by the use of a damping term [20]. In all our numerical simulations we have used a damping 
term of intensity 0.1, that is, when we update a message x, we do not assign to it directly the new value xNew, but 
rather the weighted sum 0.9 * xNew + 0.1 * x. We have verified that under these conditions the convergence (if 
any) is always exponentially fast in the number of iterative steps (although sometimes with an exponent very small). 
Because of the exponentially fast convergence, our arbitrary choice of considering BP equations solved when the 
maximal change in any BP message is below 10^'' turns out to be very reasonable: indeed an accuracy of 10~® can 
be reached by simply doubling the running time. Anyhow, in order to avoid entering a never-ending loop we have 
also fixed a maximum number of iterations equal to 1000; when this limit is reached non-converged BP messages are 
used to compute marginals and to proceed with the decimation. 

The last comment about technical issues regards the initialization of BP messages before the iterative solving 
procedure is applied. At the beginning, when the formula is still not decimated, BP messages are initialized in a 
random way assigning to each tanh^hi^a) a random value uniformly distributed between —0.2 and 0.2. After each 
variable decimation, one can choose to keep the BP messages obtained from the last iterative procedure or to re- 
initialize them along the same random way described above. In principle, if a single BP fixed point exists and if this 
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FIG. 10: Success probability of the BP guided decimation algorithm as a function of a for random 4-sat formulas of various 
sizes. The vertical line marks the value a* = 9.05 beyond which the analytical computations predicted a condensation transition 
in the residual entropy lo{0). 



is reached by the iterative method, then the starting point should be irrelevant. Moreover, one would expect that 
the BP fixed points of two formulas differing in just a variable are very close, and that starting from the one already 
reached should help convergence (with respect to start from random messages). This intuition turns out to be wrong. 
We have strong numerical evidence that a random re-initiahzation of BP messages after each decimation strongly 
enhances the performances of the algorithm. A possible explanation is the following. Our numerical procedure does 
not produce a perfect estimation of the marginal probabilities (in particular when the stopping criterion used is the 
maximal number of iterations); if messages are not re- initialized small errors may easily accumulate in the same 
direction, while a random re-initialization of BP messages results in a partial neutralization of these errors. 



2. Algorithm performances and convergence probabilities 

As a first result, we show in Fig. 10 the success probability for the BP guided decimation algorithm, i.e. the fraction 

of formulas which have been solved by this algorithm. The numerical data clearly point to an algorithmic threshold 
tta very close to the theoretical prediction of the point a* = 9.05 (marked by a vertical line in Fig. 10) above which 
phase transitions occur in the thermodynamic properties of the decimated ensemble of random formulas. For a < a^, 
a large N formula is solved with positive probability by the BP guided decimation algorithm. The appearance of a 
jump in the function (l)(6) at a ~ 8.1 [see below for a more detailed analysis of 4>{0)], with a consequent avalanche 
of directly implied variables during the dc;c;imation of formulas with a > 8.1, docs not have any visible effect on the 
success probability. This phenomenon has however a trace in the random variable ^haitj that is the fraction of variables 
assigned before the discovery of a contradiction during the unsuccesful runs. The distribution of this random variable 
is shown in Fig. 11 for two values of a (below and above aa). One can see a maximum in this distribution for values 
of 9 slightly smaller than 9'^{a), the point of discontinuity of (j){0). 

In the following we are going to present data only in the region a < aa- In order to reduce finite size effects we 
will concentrate only on formulas which have been actually solved by our algorithm. The study of the convergence 
probability and of the average convergence time for the iterative method used to solve the BP equations provides very 
useful information, as it allows to identify the most difficult formulas, which should appear close to the threshold. In 
Fig. 12 we show both the probability that the BP fixed point is not reached after 1000 iterations (upper panels) and 
the average number of iterations required to converge (lower panels). Non-converged instances count with 1000 in the 
average. Four values of a are shown (from left to right), a = 7.0, 8.0, 8.4 and 8.8, and 9 > 0.5 is not shown since in 
that region nothing of interest takes place. 

We see that, for small values of a, by increasing the size of formulas the probability that BP does not converge in 
1000 steps reduces considerably, thus suggesting that in the large N limit the typical running time of BP is below 
1000 for any 9 value. On the contrary, for larger values of a, the probability that BP does not converge is not varying 
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FIG. 11: Distribution of the halting time of the BP guided decimation algorithm on 4-sat random formulas with a = 8.4 (left 
panel) and a = 9.2 (right panel). Vertical lines show the value of ^+(a) where (j){9) is discontinuous. 
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FIG. 12: Probability of non convergence in 1000 iterations (upper panels) and average number of iterations required to reach 
convergence (lower panels) for the BP part of the BP guided decimation algorithm, as a function of the fraction 9 of decimated 
variables. Prom left to right a = 7, a = 8, a = 8.4, a = 8.8. 



very much with N and seems to remain positive even in the large limit, thus suggesting that the typical number 
of iterations required to make BP converge is larger than 1000 for some values of 0. 

The overall picture we get from Fig. 12 is very clear. For any a value, the decimation procedure initially produces 
formulas which are more and more difficult to solve and the running time of BP thus increases with 9. The running 
time (or equivalently the probability of not converging in a fixed number of iterations) has a maximum at a value 
Omax{ct) and then decreases again. By increasing a, 9^ax{ce) decreases and the running time at ^max increases. It is 
natural to expect that the maximum running time should diverge at the threshold moreover, if one assumes that 
this phenomenon is related to the critical point a* marking the end of the (first-order) condensation transition line 
9c{o!) for a > a,, one should expect that 0max(tt) is a precursor of the transition line 9c{(y) in the phase a < q*. The 
data of 0max(Q;) plotted with filled circles in Fig. 9 are in agreement with this intuition, showing in particular that 



24 



0.25 



3 

o 
i_ 

c 
a> 

■g 
■<n 

0} 



0.15 




0.05 




e 



FIG. 13; Residual entropy as a lunction of the fraction 6 of decimated variables for a = 7 (upper curves) and a = 8.4 (lower 
curves). The symbols correspond to the analytical predictions presented in Sec. VC, the lines to the numerical simulations of 
the BP guided decimation algorithm for various sizes. 



^max(a) reaches values very close to 0^ for the largest values of a the algorithm is able to handle. 



3. Entropy of decimated formulas 

We measured the entropy of decimated formulas along the execution of the BP guided decimation algorithm, using 
the Bethe approximation stated in (10). When comparing these results with the analytic prediction presented in 
Section VC, a lot of care is required in dealing with cases where BP did not converge to a fixed point. Indeed in these 
cases the marginal probabilities do not satisfy the consistency constraints and the resulting value for the entropy may 
be quite far from the correct one. In order to avoid this problem we have adopted a drastic, but safe approach: we 
take the average over only those formulas for which the algorithm always converged before reaching the 1000 iterations 
limit. A possible criticism to this approach is that we are taking the average over the simplest formulas, thus obtaining 
a biased estimate for the residual entropy. If this criticism is well founded we should observe a dependence of the 
average residual entropy upon the value of maximal number of iterations. Actually we do not observe any variation 
by doubling the maximal number of iterations. 

In Fig. 13 we plot the residual entropy as a function of 9 for three different sizes (lines) together with the analytical 
predictions of Sec. VC (points with errors). For a = 7.0, even the = 10"^ data are almost superimposed to the 
analytic result. On the contrary, for a = 8.4 finite size effects are much more evident and only N = 10"^ data start to 
be compatible with the analytical computations in the thermodynamic limit. 

It is also worth noticing that the function shows its point of maximum curvature close to ^max- More in general 
the curvature of a;(^) seems to be somehow related to the typical running time of BP. 



It- Forced variables and multiple WP fixed points 



We also keep track of the fraction of logically implied variables in the partially decimated formulas, measuring it 
with the WP algorithm (equivalent to UCP as explained in Sec. II D). In the main panel of Fig. 14 we plot the function 
(f){9) computed analytically (solid curves). For a = 7.0 the function (f){9) is smoothly increasing and the numerical 
data follow this curve so nicely that only the N = 10"^ data is hardly visible, the rest being perfectly superimposed 
to the analytic curve. For a = 8.4 the function 4>{9) is multivalued in the interval [6'_{a), 9'_^_{a)] and the two curves 
plotted in the main panel correspond to the two branches (j){9) and tp{6) defined in Sec. VD. The numerical data 
for a = 8.4 are shown in the insets for clarity. The left inset corresponds to the decimation algorithm (which indeed 
increases during the run). The right inset reports the data gathered while running the backward algorithm which 
works as follows. 
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FIG. 14: The fraction (j){9) of assigned and logically implied variables for 4-sat formulas of density a = 7 and a = 8.4. The 
insets detail the results at a = 8.4 for the decimation and backward algorithm, see the text for details. 

After a succcsful run of the BP guided decimation algorithm we use the sohition r constructed as the reference 
one, and variables are unfixed one by one in the reverse fixing order: in this way at any 9 value the residual formula 
is exactly the same the decimation algorithm had to work with. The only differences between the two algorithms are 
the initial values for the WP and BP messages: in the backward algorithm all messages are set initially according 
to the solution foimd in 9 = 1, namely tanh hi-^a = —JfTi and \)i^a = 1(7^ = Jf ) for all edges. Then the updates 
of WP and BP are run as usual for the edges outside the decimated ones. The numerical data shown in the insets 
suggest that, although finite size effects are huge, in the large N limit the decimation (resp. the backward) algorithm 
follow the lower branch (j){9) (resp. upper branch '0(6')) curve shown in the main panel. As predicted by the analytical 
computation, for a > a'^ ~ 8.05 the fraction of variables which arc frozen (either assigned or directly implied by WP), 
(/>(0), has a hysteresis loop when the number of assigned variables 9 is increased and decreased across the interval 
[9'_{a),9'_^{a)\, the backward algorithm used when decreasing 9 corresponding to the Ii boundary condition of the 
infinite tree computation. The hysteresis loop obtained by looking at the WP messages is reported with a full line in 
Fig. 15 (where data for = 3 x lO** and a = 8.4 are used). 

The apparent paradox discussed at the end of Sec. VD shows up here again, at the level of the single sample 
analysis instead of the computation on the infinite tree: the forward and backward procedure allows us to construct 
two distinct fixed points of the WP equations on the same partially decimated formula (see points A and B in Fig. 15), 
while the success probability of the algorithm is still positive for this value of a and the residual entropy of Fig. 13 has 
no singularity. In addition to the above discussion on the non-commutativity between the projection from WP to BP 
and the limit of infinite depth tree/number of iterative updates, it is worth noting that with the initial condition used 
in the backward algorithm the Warning Propagation procedure corresponds actually to the whitening construction 
(see for instance [39]), starting from the solution found at the end of the forward algorithm. The interpretation of the 
number of frozen variables in A and B is thus different: A corresponds to the variables which are logically implied (in 
the UCP sense) in the partially decimated formula. On the other hand B counts the number of constrained variables 
in the core [39] of the reference solution of the partially decimated formula (the one reached by the decimation). The 
existence of distinct WP fixed points would be a sign of a positive SP complexity if one assumed these fixed points 
to be exponentially numerous. Even when this assumption is correct it does however not lead to a contradiction 
with the inexistence of thermodynamically relevant clusters or long-range correlations as defined in (23). The former 
corresponds indeed to a IRSB computation with Parisi parameter m = and the latter to m = 1. We checked indeed 
that computing the residual entropy with the backward algorithm yields the same result as with the decimation one, 
whereas in presence of an extensive thermodynamical complexity we would have obtained only the contribution from 
the internal entropy of the cluster around the reference solution. 

More indications come from the study of BP fixed point messages. In Fig. 15 we plot the fraction of variables 
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FIG. 15: Fraction of frozen variables as a function of 6, for a = 8.4. The full lines correspond to the WP results in the 
decimation and backward algorithm, the dashed lines to the analysis of the BP messages. 

that receive BP messages forcing it to a unique value (BP frozen) and the fraction of variables which are extremely 
biased under the BP messages, i.e. the less probable value has a probability smaller than 10^^" (BP prob< 10"^"). 
The fraction of BP frozen variables again shows hysteresis: in the decimation algorithm BP frozen variables perfectly 
coincide with WP frozen variables, while in the backward algorithm BP frozen variables are a little more than WP 
frozen variables because of the numerical impossibility of keeping marginals arbitrarily small. What is more interesting 
is that the fraction of extremely biased variables (BP prob< 10"^*^) does not show any sign of hysteresis: the same 
smooth curve is followed by the decimation algorithm as well as by the backward algorithm. This observation 
suggests that BP marginals obtained by the two algorithms are exactly the same, except for (almost) completely 
frozen variables. 



F. Large k behaviour 

We have seen that the behaviour of the ensemble of decimated random formulas is richer in the satisfiability case than 

for xor-satisfiability. with in particular the appearance of two sets of critical lines, one describing the thermodynamic 
properties, uj{0), and the other the singularities of the logical implications, <p{0). The common wisdom is however that 
when the length k of the clauses gets large the satisfiability model gets simpler, notably allowing some tight rigorous 
results [40 43] , and in fact becomes very similar to the xor-satisfiability one. We shall hence briefly discuss now the 
large k limit of our results for the decimated ensembles. 

Let us begin with the xor-satisfiability case; the results of Sec. IV having an explicit form, they can be easily 
turned in asymptotic expansions for large k. For instance the threshold a* given in Eq. (33) is found to behave as 
f (1 + 0(fc^^)), while the clustering threshold cud is ^(1 + 0(lnlnA:/lnfc)) and the condensation threshold Uc goes 
to 1 in the large k limit. One can also study the behaviour of the transition line Oc{a) in the last of these three 
asymptotic scales (i.e. for a constant with respects to k). After a short computation, which consists in expanding 
the fixed point solutions <j){6) and 'tp{9) of Eq.(32), and the associated entropies (35), one finds that the two leading 
orders are 9c{a) ^ 1 — q — ae^"*^. 

Large k asymptotic expansions of the non-decimated {6 = 0) ensemble of fc-satisfiability were performed for the 
clustering and condensation thresholds in [30] and in [9] for the satisfiability one. The leading order of the clustering 
threshold is 2''^^, while both condensation and satisfiability occurs for values of a around 2*^ In 2 (see [9, 30] for 
the subleading corrections, which are different for etc and as). Consider now the fraction of decimated or implied 
variables 0(^), computed for the satisfiability ensemble according to Eqs. (51,52). In the large k hmit a crucial 
simplification occurs thanks to the concentration, at the leading order, of the h and u random variables solutions of 
the RS equation (44) around 0. From this fact follows easily (compare with Eqs. (30,31)) that the functions (/)((?) and 
■0(0), for satisfiability random formulas of clause density a, approach the corresponding functions of xor-satisfiability 
formulas of clause density a/ 2''. In consequence this correspondence holds for the lines 0'j.{a), and the critical point 
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FIG. 16: Critical lines 0±{a) and 9±{a) for fc-satisfiability, left: k — 4, right: k — 5. The dashed lines are the fc-xorsatisfiability 
critical lines, with a rescaling of 2*^ on their clause density. 



k 




9, 


a: 


e: 


4 


9.05 


0.045 


8.05 


0.35 


5 


16.8 


0.188 


14.7 


0.46 



TABLE I: Values of the thresholds q, and q^, and the associated fraction of decimated variables 9, and 9^, for random 
fe-satisfiability formulas, fc = 4, 5. 

a'^ of satisfiability is expected to scale as + 0{k^^)). The asymptotic study of the thermodynamic lines 0±{a) 

is slightly more involved because of the continuous nature of the second member of the pair in the random variable 
(/i, h'^), whereas it can only take two values in {h, l)"^). One can however notice that a consistent Ansatz in the large 
k limit is to assume (/i, tanh /i'^) « (/i,t()'^), that is all non strictly forcing messages are approximated as completely 
unbiased. If this hypothesis is correct the distinction between 9+{a) (resp. 9-{a), a*) and 0'^{a) (resp. 9'_{a), a'^) 
should vanish in the large k limit. We have not attempted to obtain a formal proof of this statement but repeated 
the determination of the satisfiability phase diagram for fc = 5. The results presented in Fig. 16 (and in Tab. I for 
the values of the thresholds) confirms the intuition stated above. The two sets of critical lines are much closer for 
k — 5 than k — 4, and also in better agreement with the xor-satisfiability values (dashed lines, with a rescaling factor 
of 2*"' on the a axis). Finally an expansion of the residual entropy at the leading order leads us to conjecture that 
asymptotically 9c{a) ~ 1 — ^ , as obtained explicitly in the xor-satisfiability case. The datas of 9c{a) obtained by 
the population dynamics algorithm for fc = 5 (not shown) are already in good agreement with this asymptotic form. 

We have also performed BP guided decimation simulations for fc = 5, and found an algorithmic threshold aa 
between 16 and 16.5. This range is clearly above the appearance of the jump in 4'{9), thus confirming the results 
presented in Sec. VE2 for fc = 4, but it also a little bit below the thermodynamic triple point (mainly because for 
fc = 5 the constraint on the maximum number of iterations produces a more drastic effect). 

VI. CONCLUSIONS 

In this article we have introduced analytical tools that allows computations similar to the Franz-Parisi quenched 
potential for diluted mean-field systems. We have precisely defined ensembles of partially decimated random CSP 
and refined their analytical description initiated in [22] . These methods have been applied to xor-satisfiability random 
formulas, putting known results [28, 29, 33, 34] in a slightly different perspective, and to the satisfiability case which 
presents a much richer phase diagram. These computations are expected to describe the behaviour of an hypothetic 
ideal decimation algorithm based on an oracle able to compute exact marginal probabilities in large graphical models. 

We have then confronted these results with the outcomes of extensive simulations of the BP guided decimation 
algorithm, which is a practical, approximate implementation of the ideal procedure. In the case of xor-satisfiability 
formulas the interpretation of the comparison is very clear and can in fact be confirmed by rigorous calculations. 
The satisfiability problem is much more difficult; the interpretation of the results of the BP decimation should be 
based on a precise description of the "small" errors made by BP, which somehow accumulate along the decimation 
for large enough values of a and cannot avoid conflicting choices in the decimated variables. Lacking such a refined 
control of BP we have to turn to a more intuitive explanation, based on the analysis of the ideal algorithm. The 



28 



algorithmic threshold aa for BP decimation on random 4-satisfiability formulas is found to be very close to the value 
a* above which clustering and condensation transitions do occur in the (a, 9) plane. One is thus led to conjecture 
more generally that the presence of a condensed regime, in which BP is expected to fail because of replica symmetry 
breaking effects, will coincide with the BP decimation threshold for generic CSP. 

It is fair to say that we first found the numerical results reported in this work quite surprising. We initially 
expected [22] the BP guided decimation algorithm to fail when (p{6) develops a jump, that is when the assignment of 
a single variable produces an avalanche of 0{N) forced variables: in this situation, we expected that a contradiction 
would be generated with high probability. Our numerical results clearly show this is not the case: the algorithm we 
have studied is able to fix at the same time a finite fraction of variables without entering a contradiction. Moreover, 
as soon as a thermodynamical condensation transition is taking place, for a > a*, the success probability falls down 
to zero sharply. The most natural explanation to these observations is that the marginals used by the algorithm 
to fix variables are extremely close to true marginals for any a < a* (and this is clearly a very good news for 
the use of BP even very close to the clustering threshold) and not so good above a*. Still some small differences 
between BP marginals and true marginals are expected even below a,, mainly given by corrections to the Bethe 
approximation [44, 45] and to the precision used to solve BP equations (e < 10^^ in our case). Then, why these small 
differences between true marginals and BP estimations do not affect the success probability of the algorithm before 
a*, while they become relevant above a*? Maybe because the nature of these errors changes crossing a*. Roughly 
speaking, below a, they have a statistical origin and produce random perturbations of intensity and e (in a 
sense that should be precised): if these errors are largely uncorrelated, when summing 0{N) of these we still get 
errors of order 1 / y/N and e^/N, which are very small. On the contrary, above a* , deviations from true marginals are 
systematic, because of long range correlations: in this case errors are strongly correlated and summing 0{N) of these 
produces a contradiction with high probability. 

Let us also sketch a few possible directions for future research. Apart from the computation of the usual Franz- 
Parisi potential, the analytical formalism can be adapted to other CSP models besides the satisfiability and xor- 
satisfiability cases treated here. For instance the case of coloring should be relatively easy because of the triviality of 
the RS description (as in xorsat), and relates to the recent study of [46, 47]. It would also be interesting to perform 
simulations of the BP decimation algorithm on coloring formulas (this was done in [48] but with a bias in the choice 
of the decimated variable), and check whether our conjecture on the closeness of a^, and a, holds in this case. 

A more rigorous analysis of the BP guided decimation algorithm would be welcome, and should be easier to 
perform in the large k limit. Until very recently the highest clause densities where algorithms were rigorously shown 
to succeed in finding solutions were 0(2'^/fc) [36]. Our rough analysis suggest that the threshold of success for BP 
guided decimation should be on that scale too, with Q;a(fc) ^ e2^ /k. It has been shown very recently in [42] that 
a polynomial time algorithm can find solutions of formulas with densities up to 2^h\k/k, which corresponds to the 
scale of the dynamic threshold Q!d(fc). 

On the algorithmic side many variations on the simplest procedure studied in this paper are more efficient (and 
notably the Survey Propagation algorithm [7, 16]), yet seems much more difficult to tackle analytically. A slight 
modification of the BP decimation algorithm where the order of the assignments is not uniformly random but treats 
in priority variables with the most biased marginals already changes substantively the highest value of a where 
formulas are solved with positive probability [11]. Other interesting directions to explore would be more eflacients 
procedures for the resolution of the BP equations (for instance double- loop algorithms [49]), the use of reinforcement 
strategies [50] instead of explicit decimation, or the coordinated decimation of groups of variables [51]. 
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APPENDIX A: DETAILS ON THE COMPUTATIONS OF SEC. IV C 

We present in this appendix some details on the computations discussed in Sec. IV C. The properties of decimated 
xorsat formulas will be derived through the analysis of two algorithms which act on the formula and which can be 
described by the differential equation method [52-54]. Two successive steps will be performed : the logical implications 
of the decimation of a fraction 6 of the variables are first drawn with Unit Propagation. Then the structure of the set 
of solutions of the reduced formula is analyzed by the leaf removal algorithm. This approach has been developed for 
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fc-xorsat formulas in [28, 29] and generalized to arbitrary degree distributions in [33], we reproduce here their results 
for the sake of self-containedness. 



1. Unit propagation 

As explained at the beginning of Sec. IV C wc can assume here the formula to be an unfrustratcd (Ja = 1) set of 
M = aN equations of the form (1), each involving k indices chosen uniformly at random among the N variables. A 
fraction of the variables are then set to +1, and can then be removed from the clauses where they appeared. Let us 
call i?„ = Npii the number of clauses with k non decimated variables, and Li = NXi the number of variables which 
appears in I clauses (note that a decimated variable does not appear in any clause after the above simplification step). 
One obtains: 

Xi = {i-0)e-^'^^-^ + e5ifl , (Al) 
= qQ^ (1 - 6i)«6l'=-'^ for K<k . (A2) 

Consider now the action of the Unit Propagation algorithm. As long as clauses of length n = 1 are present in the 
formula, one of them is chosen randomly, the single variable it contains is fixed to +1, and is then removed from the 
other clauses it appeared in. The formula obtained after T steps of this procedure is uniformly random, conditional 
on the values of {R^iT), Li{T)}, so that the analysis of the process amounts to follow these random variables. At 
each time step T ^ T + 1 they vary by a bounded random increment whose distribution depends only on the 
current values {Rk{T), Li{T)} and not on the previous history of the process. In consequence the reduced quantities 
PK.{t) = Rk.{T = Nt)/N and Xi{t) = Li{T = Nt)/N concentrate around their average values [52-54], solutions of the 
following set of differential equations: 

-T:h{t) = ^1,0 - ^ „, , (A3) 



dt El' I' ^i' it) 



(K+l)p„+l(t) KPf,{t) 



(A4) 



These expressions arise because the variable selected in the unit clause is present in I clauses with probability propor- 
tional to lXi{t); apart from the unit clause, the I — 1 other occurences of the variable take place in clauses of length k 
with probability proportional to Kp^{t). 

In order to solve these equations we introduce the generating function of the initial distribution of degrees of the 
variables, A(a;) = XixK Equation (A3) is solved for any Z > 1 by Xi{t) = Xia{ty, where a{t) is the solution of 

^a(t) = — \ , with the initial condition aCt = 0) = I . (A5) 
dt X {a{t)j 

One can then insert this expression of A; {t) in (A4) and solve to obtain 

= S (:>.' (^)" (l - ^)""" - &,.A'(«(*))(1 . (A6) 

The differential equations (A3,A4) only make sense if pi{t) > : the procedure stops when no more unit clause can 
be selected, i.e. at the reduced stopping time = min{< : pi{t) — 0}. 

For the initial degree distribution given in (Al) one obtains for the derivative of the generating function: A' (a;) = 
ak{l — 9)e~"'^'^^~''\ hence by integration of (A5) 

a{t) = l- \ \n( ) . (A7) 



ak \l-9-t^ 

Plugging this result in the expression (A6) of pi{t), one finds that t* is the smallest solution of 

ak{0 + tr-'=\nf^^^\ . (AS) 
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Defining finally (j> — 6 + t^,, one realizes that (/> is the smallest solution of Eq. (32): this quantity gives the fraction 
of variables that are either fixed by the decimation or by the propagation of logical implications. When all these 
implications are taken into account, the degree distributions of the reduced formula reads 

X^iU) = (1 - ^)e-^(-^-^)Mi_Q! > 1 , (A9) 
= af^Vl-.^)''/-'' forK>2. (AlO) 



In consequence the number of non-trivial clauses is 

k 



M' = 7V^p«(f,)=a7V(l-/-fc(l-<^)/-i) . (All) 



«=2 



2. Leaf removal 



Wc shall now analyze the set of solutions of random unfrustratcd formulas with degree distributions given by 
(A9,A10). Following [28, 29], wc consider the action of the leaf removal algorithm on such a hypergraph. Each leaf 
removal step consists in picking at random one variable of degree 1 (a leaf), and remove the single clause it appeared 
in. This simplification is repeated until no leaf is left in the graph, which provok(^s the stopping of the algorithm. 
There are two possible situations at that point: cither all clauses have been removed, or there remains a non empty 
2-core, that is the maximal subgraph of the original formula in which all variables have degree at least 2. In both 
cases the total entropy is given (in units of log 2) by the initial number of variables minus the initial number of clauses. 
In the former case the set of solutions is unclustered, while in the latter the solutions are split into an exponential 
number of clusters. Each cluster corresponds to one solution of the 2-core formula, hence the complexity, i.e. the 
exponential rate of the number of clusters, is given by the difference between the number of variables and of clauses 
in the 2-core. Each cluster contains an exponential number of solutions, this internal entropy being associated to the 
degeneracy arising from the freedom in the choice of the value of the leaf variables when reinserted in the formula in 
the reverse order with respect to their removal. 

The evolution of the degree connectivities during the execution of the leaf removal algorithm can be computed in 
a very similar manner with respect to the Unit Propagation case sketched above. With a slight abuse of notation we 
denote again NXi {t) and iVp^ (t) the average number of variables and constraints of degrees I and k after Nt steps of 
the leaf removal algorithm. These quantities obey the following set of differential equations: 

:P^{t) = -^^—7—77: ' (A12) 



—Xi{t) = + c>i,o + ' — 



{l + l)\i+i{t) IXiit) 



(A13) 



These equations are essentially the same as (A3,A4) with the role of A and p being exchanged (in the leaf removal 
algorithm one picks a variable of degree 1, in Unit Clause Propagation it is a clause of degree 1). They can thus be 
solved with the same technique. Let us define the generating function of the clause lengths at the beginning of the 
leaf removal, p{x) — J2k Pi^^'^- -^^ all times pK.{t) = pJ){tY, where h{t) is solution of 

i'^'^=-7m^ withKi=o) = i. (A14) 

The distribution of the variable degrees is then found to be 

The stopping time of the leaf removal algorithm is given by = min{f : X\{t) = 0}. A non-trivial 2-core exists at this 
stopping time if and only if 6(t*) > 0. As the function h(i) is decreasing in time (see Eq. (A14)), the value 6* = 6(f,) 
can also be defined as the largest solution in [0, 1] of 

.'(6.)(i-W = ^£a,(i-^)". (AI6) 
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Let us now apply these results to the degree distributions (A9,A10) of the formula obtained at the end of the Unit 
Propagation. These imply the following form of the derivative of the clause length generating function: 

p'(x) = a/e(l-<^)((<^ + a;(l-0))'=-i . (A17) 

The solution of the equation (A16) is either 6, = 0, or the largest strictly positive solution of 



1 — 6* = exp 



afc((/) + 6,(l-(/.))'=-^ -/-^ . (A18) 



Defining 6* = {ip — (p) / {1 — (p) , one realizes that the equation on 6* is equivalent to ip being the largest solution of 
Eq. (32). We have thus justified one of the statement made in Sec. IV C : when there is only one solution to Eq. (32), 
tp = (j) OT in other terms 6* = 0. This means that the leaf removal algorithm does not stop before having emptied 
the complete formula, there is no 2-core and the solution space is not clustered. On the contrary the existence of the 
multiple solutions tp > (p corresponds to 6* > and hence to the presence of a non-trivial 2-core in the hypergraph of 
constraints. This latter case corresponds to a > a* and 6 S [9 -.{a), 6+ (a)]. 

Let us conclude with the justification of the expressions of the entropy and complexity given in Sec. IV. We have 
seen that the number of non-implied variables at the end of the Unit Propagation procedure is A''(l — <p), while the 
number of non-trivial clauses is given in Eq. (All). In the region of the (a, 9) plane where there is a single solution of 
Eq. (32) the entropy (35) is given (in units of In 2) by the difference between the number of variables and constraints, 
in agreement with the results of [28, 29]. When a non-empty 2-core subsists at the end of the leaf removal algorithm, 
one can compute from the solution of (A12,A13) at the stopping time the number of variables and clauses in the 
2-core: 

iVeore = - </> - aA;(l - V')(V''"' - /"')] , (A19) 

Mcore = Nayj''-cp''-kcp''-\iP-(P)] . (A20) 

It is then easy to verify that Ld((p) — uj^ip) = ln(2)(A^corc — Mcorc)/N. When this quantity is positive, that is in the 
interval [9-{a),9c{a)], it is equal to the entropy density associated with the exponential number of solutions of the 
2-core. On the contrary when it is negative the 2-core with more clauses than variables has only a subexponential 
number of solutions (recall that we conditioned from the beginning on the formula being satisfiable and on the reference 
configuration unveiled being a solution) . One can show in this case that the entropy arising from the variables outside 
the 2-core is given by oj{tp), hence the total entropy of the decimated formula is always given by max[a}(0), cu{tp)], the 
largest branch as plotted in the right panel of Fig. 4. 
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